GenDRAM: Hardware-Software Co-Design of General Platform in DRAM

📅 2026-02-27
🤖 AI Summary
This work addresses the limitations of dynamic programming algorithms on conventional architectures, where performance is constrained by data movement. Existing in-memory computing accelerators cover only part of the computational pipeline, which shifts the bottleneck to host communication and off-chip transfers. To overcome these challenges, the authors propose GenDRAM, a massively parallel in-memory computing accelerator based on monolithic 3D DRAM that integrates, for the first time on a single chip, a complete genomic analysis pipeline spanning from seed generation to alignment. GenDRAM features a hybrid architecture that combines a dedicated Search Processing Unit for efficient search operations with a multiplication-free general-purpose Compute Processing Unit for flexible computation, along with a 3D-aware data mapping strategy tailored to the latency characteristics of the memory hierarchy. Simulations show that GenDRAM achieves over 68× speedup on all-pairs shortest path (APSP) tasks compared to state-of-the-art GPUs and accelerates end-to-end genomic analysis by more than 22×.
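To make the "multiplication-free" point concrete, the sketch below computes a local alignment score with the standard Smith-Waterman recurrence, which needs only additions and max comparisons per cell. It is an illustration of the general technique, not GenDRAM's kernel: the function name `local_alignment_score`, the linear gap model, and the scoring values (`match=2`, `mismatch=-1`, `gap=-2`) are assumptions chosen for the example.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score of strings a and b.

    Scoring values are illustrative assumptions, not taken from the paper.
    Each cell update uses only additions and max comparisons.
    """
    prev = [0] * (len(b) + 1)  # DP row for the previous character of a
    best = 0
    for ca in a:
        curr = [0]  # column 0 is always 0 for local alignment
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            score = max(0, diag, up, left)
            curr.append(score)
            if score > best:
                best = score
        prev = curr
    return best


# The read "TTAC" occurs exactly inside "GATTACA": 4 matches * 2 = 8.
example_score = local_alignment_score("GATTACA", "TTAC")
```

Only two rows of the table are kept at a time, which is one reason such recurrences are usually limited by memory traffic rather than arithmetic.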

📝 Abstract
Dynamic programming (DP) algorithms, such as All-Pairs Shortest Path (APSP) and genomic sequence alignment, are fundamental to many scientific domains but are severely bottlenecked by data movement on conventional architectures. While Processing-in-Memory (PIM) offers a promising solution, existing accelerators often address only a fraction of the workflow, creating new system-level bottlenecks in host-accelerator communication and off-chip data streaming. In this work, we propose GenDRAM, a massively parallel PIM accelerator that overcomes these limitations. GenDRAM leverages the immense capacity and internal bandwidth of monolithic 3D DRAM (M3D DRAM) to integrate entire data-intensive pipelines, such as the full genomics workflow from seeding to alignment, onto a single heterogeneous chip. At its core is a novel architecture featuring specialized Search PUs for memory-intensive tasks and universal, multiplier-less Compute PUs for diverse DP calculations. This is enabled by a 3D-aware data mapping strategy that exploits the tiered latency of M3D DRAM for performance optimization. Through comprehensive simulation, we demonstrate that GenDRAM achieves a transformative performance leap, outperforming state-of-the-art GPU systems by over 68x on APSP and over 22x on the end-to-end genomics pipeline.
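For readers unfamiliar with APSP as a DP workload, here is a minimal sketch of the classic Floyd-Warshall recurrence, dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j]). It is a plain CPU reference, not the GenDRAM mapping. The function name `floyd_warshall` and the sample graph are assumptions made for illustration, and the code assumes there are no negative cycles. Note that the inner loop uses only an addition and a comparison, while every step of `k` sweeps the whole n-by-n matrix, which is the data-movement pattern the abstract refers to.

```python
import math


def floyd_warshall(weights):
    """All-pairs shortest-path distances for a dense weight matrix.

    weights[i][j] is the edge cost from i to j, or math.inf if there is
    no edge. Assumes no negative cycles. Returns a new matrix.
    """
    n = len(weights)
    dist = [row[:] for row in weights]  # copy so the input is not modified
    for i in range(n):
        dist[i][i] = min(dist[i][i], 0)
    for k in range(n):
        row_k = dist[k]
        for i in range(n):
            d_ik = dist[i][k]
            if d_ik == math.inf:
                continue  # no path i -> k, so k cannot improve row i
            row_i = dist[i]
            for j in range(n):
                cand = d_ik + row_k[j]  # add ...
                if cand < row_i[j]:     # ... and compare: no multiplies
                    row_i[j] = cand
    return dist


INF = math.inf
# Hypothetical 4-node directed graph used only for this example.
graph = [
    [0, 3, INF, 7],
    [8, 0, 2, INF],
    [5, INF, 0, 1],
    [2, INF, INF, 0],
]
apsp = floyd_warshall(graph)
```

The triple loop performs n^3 add-and-compare steps over an n^2 working set, so for large graphs the cost of moving matrix rows tends to dominate the arithmetic.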
Problem

Research questions and friction points this paper is trying to address.

Dynamic Programming
Processing-in-Memory
Data Movement Bottleneck
Genomic Sequence Alignment
System-level Bottleneck
Innovation

Methods, ideas, or system contributions that make the work stand out.

Processing-in-Memory
3D DRAM
Hardware-Software Co-Design
Dynamic Programming
Genomics Acceleration