CiMBA: Accelerating Genome Sequencing through On-Device Basecalling via Compute-in-Memory

📅 2025-04-09
🏛️ IEEE Transactions on Parallel and Distributed Systems
📈 Citations: 0
Influential: 0
🤖 AI Summary
Real-time genome basecalling faces two challenges: high data transmission overhead (up to 0.5 GB/min) and a computational bottleneck, with basecalling accounting for more than 40% of total analysis time. To address these, this work proposes the first embedded compute-in-memory (CiM) accelerator capable of real-time, on-device basecalling. It maps raw electrical signals to DNA sequences on the device itself, which greatly reduces off-chip data transfer. The work introduces CiMBA, an embedded CiM accelerator, and AL-Dorado, a family of analog-focused basecalling neural networks. A hardware/software co-design framework enables hardware-aware compression, quantization, and analog-domain DNN inference. Implemented as a roughly 25 mm² custom ASIC, the design achieves a throughput of 4.77 million bases per second (24× the rate required for real-time operation) and 17×/27× better power/area efficiency than the best prior embedded basecalling accelerator, with accuracy comparable to state-of-the-art software basecallers such as Dorado.

📝 Abstract
As genome sequencing is finding utility in a wide variety of domains beyond the confines of traditional medical settings, its computational pipeline faces two significant challenges. First, the creation of up to 0.5 GB of data per minute imposes substantial communication and storage overheads. Second, the sequencing pipeline is bottlenecked at the basecalling step, which consumes >40% of genome analysis time. A range of proposals have attempted to address these challenges, with limited success. We propose to address these challenges with a Compute-in-Memory Basecalling Accelerator (CiMBA), the first embedded ($\sim$25 mm$^2$) accelerator capable of real-time, on-device basecalling, coupled with AnaLog (AL)-Dorado, a new family of analog-focused basecalling DNNs. Our resulting hardware/software co-design greatly reduces data communication overhead, is capable of a throughput of 4.77 million bases per second, 24x that required for real-time operation, and achieves 17x/27x power/area efficiency over the best prior basecalling embedded accelerator while maintaining a high accuracy comparable to state-of-the-art software basecallers.
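
The headline figures in the abstract can be cross-checked with a little arithmetic. The sketch below derives the real-time basecalling rate implied by the 4.77 million bases per second throughput and the 24x margin, and converts the 0.5 GB per minute data rate to MB/s. The function and constant names are illustrative, and the use of decimal units (1 GB = 1000 MB) is an assumption, since the abstract does not say which convention it uses.

```python
# Figures quoted in the abstract.
THROUGHPUT_BASES_PER_S = 4.77e6  # CiMBA basecalling throughput
REALTIME_MARGIN = 24             # "24x that required for real-time operation"
RAW_DATA_GB_PER_MIN = 0.5        # upper bound on raw data produced per minute


def implied_realtime_rate(throughput, margin):
    """Bases per second needed for real-time operation, given the stated margin."""
    return throughput / margin


def raw_data_rate_mb_per_s(gb_per_min, mb_per_gb=1000):
    """Convert GB/min to MB/s. Decimal units are an assumption, not from the paper."""
    return gb_per_min * mb_per_gb / 60


realtime_rate = implied_realtime_rate(THROUGHPUT_BASES_PER_S, REALTIME_MARGIN)
raw_rate = raw_data_rate_mb_per_s(RAW_DATA_GB_PER_MIN)

print(f"Implied real-time requirement: {realtime_rate:,.0f} bases/s")  # 198,750 bases/s
print(f"Raw data rate: {raw_rate:.2f} MB/s")                           # 8.33 MB/s
```

Under these assumptions, real-time operation corresponds to roughly 199 thousand bases per second, and the raw data stream that on-device basecalling avoids shipping off the device is on the order of 8 MB/s.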
Problem

Research questions and friction points this paper is trying to address.

Reducing data communication and storage overhead in genome sequencing
Accelerating the bottleneck basecalling step in genome analysis
Enabling real-time on-device basecalling with high efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Compute-in-Memory Basecalling Accelerator (CiMBA)
AnaLog (AL)-Dorado basecalling DNNs
Hardware/software co-design for efficiency
William Andrew Simon
International Business Machines (IBM)
I. Boybat
International Business Machines (IBM)
Riselda Kodra
Swiss Federal Institute of Technology, Lausanne
Elena Ferro
International Business Machines (IBM)
Gagandeep Singh
Advanced Micro Devices (AMD)
M. Alser
Georgia State University
Shubham Jain
International Business Machines (IBM)
H. Tsai
International Business Machines (IBM)
Geoffrey W. Burr
IBM Research - Almaden
Neuromorphic computing, Storage Class Memory
O. Mutlu
ETH, Zurich
A. Sebastian
International Business Machines (IBM)