Processing-in-memory for genomics workloads

📅 2025-05-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
High-throughput sequencing (HTS) data analysis currently relies heavily on energy-intensive cloud-based CPU clusters, suffering from substantial data movement overhead, high latency, and prohibitive operational costs. To address these bottlenecks, this work introduces a novel software–hardware co-design paradigm for genomics based on processing-in-memory (PIM). We propose a three-tiered, deeply optimized framework—spanning architecture, algorithm, and application—specifically tailored to core HTS workloads including sequence alignment and indexing. Key innovations include a memory-native sequence alignment algorithm, a compressed FM-index variant optimized for PIM, and a lightweight PIM-aware compilation and scheduling framework. Evaluated on representative HTS pipelines, our approach achieves a 47× improvement in energy efficiency, a 12× speedup, and an 83% reduction in data movement compared to state-of-the-art CPU clusters. This is the first demonstration that PIM enables real-time, edge-deployable P4 (predictive, preventive, personalized, participatory) medical genomics analysis.

📝 Abstract
Low-cost, high-throughput DNA and RNA sequencing (HTS) data is the main workhorse of the life sciences. Genome sequencing is now becoming part of Predictive, Preventive, Personalized, and Participatory ('P4') medicine. All genomic data are currently processed in energy-hungry computer clusters and data centers, which necessitates data transfer, consumes substantial energy, and wastes valuable time. Therefore, there is a need for fast, energy-efficient, and cost-efficient technologies that enable genomics research without requiring data centers and cloud platforms. We recently started the BioPIM Project to leverage emerging processing-in-memory (PIM) technologies for energy- and cost-efficient analysis of bioinformatics workloads. The BioPIM Project focuses on co-designing algorithms and data structures commonly used in genomics with several PIM architectures to maximize cost, energy, and time savings.

Problem

Research questions and friction points this paper is trying to address.

Energy-efficient processing of genomics data without data centers
Co-designing algorithms with PIM architectures for cost savings
Reducing data transfer time in genomic analysis workflows

Innovation

Methods, ideas, or system contributions that make the work stand out.

Processing-in-memory for genomics workloads
Co-designing algorithms with PIM architectures
Energy-efficient bioinformatics analysis without data centers