HePGA: A Heterogeneous Processing-in-Memory based GNN Training Accelerator

📅 2025-08-21

🤖 AI Summary
Heterogeneous processing-in-memory (PIM) devices such as ReRAM, FeFET, PCM, MRAM, and SRAM exhibit inherent trade-offs in power, latency, area, and non-ideal characteristics (e.g., variability and limited endurance). To exploit these trade-offs for graph neural network (GNN) training, this paper proposes HePGA, a 3D-integrated heterogeneous multi-core PIM accelerator. Its core idea is to map GNN layer-specific computational kernels onto the most suitable PIM technology, jointly leveraging planar and vertical integration for adaptive orchestration of hardware resources. This preserves GNN prediction accuracy while significantly improving energy efficiency and compute density: experiments show up to 3.8x higher energy efficiency (TOPS/W) and 6.8x higher compute efficiency (TOPS/mm2) than state-of-the-art PIM accelerators. The HePGA architecture also generalizes to inference of emerging Transformer models.

📝 Abstract
Processing-In-Memory (PIM) architectures offer a promising approach to accelerate Graph Neural Network (GNN) training and inference. However, various PIM devices such as ReRAM, FeFET, PCM, MRAM, and SRAM exist, with each device offering unique trade-offs in terms of power, latency, area, and non-idealities. A heterogeneous manycore architecture enabled by 3D integration can combine multiple PIM devices on a single platform, to enable energy-efficient and high-performance GNN training. In this work, we propose a 3D heterogeneous PIM-based accelerator for GNN training, referred to as HePGA. We leverage the unique characteristics of GNN layers and associated computing kernels to optimize their mapping onto different PIM devices as well as planar tiers. Our experimental analysis shows that HePGA outperforms existing PIM-based architectures by up to 3.8x and 6.8x in energy efficiency (TOPS/W) and compute efficiency (TOPS/mm2), respectively, without sacrificing GNN prediction accuracy. Finally, we demonstrate the applicability of HePGA to accelerate inference of emerging transformer models.
Problem

Research questions and friction points this paper is trying to address.

Optimizing GNN training on diverse PIM devices
Mapping GNN layers efficiently across heterogeneous architectures
Balancing performance and energy without accuracy loss
Innovation

Methods, ideas, or system contributions that make the work stand out.

Heterogeneous PIM architecture via 3D integration
Optimized mapping across multiple PIM devices
Combines ReRAM, FeFET, PCM, MRAM, and SRAM technologies
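The mapping idea behind these contributions can be illustrated with a toy heuristic: score each candidate PIM device against the access pattern of a given GNN training kernel. All device numbers, kernel names, and selection rules below are invented for illustration; they are not the paper's actual mapping algorithm or measured device data.

```python
# Toy sketch (assumptions, not from the paper): pick a PIM device class per
# GNN training kernel based on hypothetical relative device characteristics.

# Invented relative metrics: read latency, write cost, storage density,
# and write endurance (cycles). Real values differ per technology node.
DEVICES = {
    "ReRAM": {"read": 1.0, "write": 6.0, "density": 9.0, "endurance": 1e6},
    "FeFET": {"read": 1.2, "write": 1.5, "density": 8.0, "endurance": 1e10},
    "PCM":   {"read": 1.1, "write": 8.0, "density": 8.5, "endurance": 1e8},
    "MRAM":  {"read": 1.5, "write": 2.5, "density": 5.0, "endurance": 1e15},
    "SRAM":  {"read": 0.5, "write": 0.5, "density": 1.0,
              "endurance": float("inf")},
}

def pick_device(kernel: str) -> str:
    """Map a (hypothetical) GNN training kernel to a device class."""
    if kernel == "feature_aggregation":
        # Neighbor aggregation is read-dominated on transient data:
        # favor the lowest read latency.
        return min(DEVICES, key=lambda d: DEVICES[d]["read"])
    if kernel == "weight_mvm":
        # Forward matrix-vector multiply on mostly-static weights:
        # favor the densest storage.
        return max(DEVICES, key=lambda d: DEVICES[d]["density"])
    if kernel == "weight_update":
        # Gradient updates write constantly: require high endurance,
        # then trade density against write cost.
        durable = {d: v for d, v in DEVICES.items()
                   if v["endurance"] >= 1e9}
        return max(durable, key=lambda d: durable[d]["density"]
                   / durable[d]["write"])
    raise ValueError(f"unknown kernel: {kernel}")
```

With these invented numbers, aggregation lands on SRAM, static weight MVMs on ReRAM, and write-heavy updates on FeFET; the point is only that kernel access patterns, not a single device metric, drive the placement.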
Chukwufumnanya Ogbogu
Washington State University, Pullman, WA, USA
Gaurav Narang
Arizona State University, Tempe, AZ, USA
Biresh Kumar Joardar
University of Houston
Heterogeneous architectures, Machine Learning, Processing-in-Memory, Hardware Reliability
Janardhan Rao Doppa
Washington State University, Pullman, WA, USA
Krishnendu Chakrabarty
Fulton Professor of Microelectronics, School of Electrical, Computer and Energy Engineering
Electronic design automation, Testing and Design-for-Testability, Microfluidics, Computer Engineering, Sensor Networks
Partha Pratim Pande
Washington State University, Pullman, WA, USA