HePGA: A Heterogeneous Processing-in-Memory based GNN Training Accelerator

📅 2025-08-21

🤖 AI Summary
Heterogeneous processing-in-memory (PIM) devices such as ReRAM, FeFET, PCM, MRAM, and SRAM exhibit inherent trade-offs in power, latency, area, and non-ideal characteristics (e.g., variability and limited endurance). To exploit these trade-offs for graph neural network (GNN) training, this paper proposes HePGA, a 3D-integrated heterogeneous multi-core PIM accelerator. Its core idea is to map GNN layer-specific computational kernels onto the most suitable PIM technology, jointly leveraging planar and vertical integration for adaptive orchestration of hardware resources. This preserves GNN prediction accuracy while significantly improving energy efficiency and compute density: experiments show up to 3.8x higher energy efficiency (TOPS/W) and 6.8x higher compute efficiency (TOPS/mm2) than state-of-the-art PIM accelerators. The HePGA architecture also generalizes to inference of emerging Transformer models.

📝 Abstract
Processing-In-Memory (PIM) architectures offer a promising approach to accelerate Graph Neural Network (GNN) training and inference. However, various PIM devices such as ReRAM, FeFET, PCM, MRAM, and SRAM exist, with each device offering unique trade-offs in terms of power, latency, area, and non-idealities. A heterogeneous manycore architecture enabled by 3D integration can combine multiple PIM devices on a single platform, to enable energy-efficient and high-performance GNN training. In this work, we propose a 3D heterogeneous PIM-based accelerator for GNN training, referred to as HePGA. We leverage the unique characteristics of GNN layers and associated computing kernels to optimize their mapping onto different PIM devices as well as planar tiers. Our experimental analysis shows that HePGA outperforms existing PIM-based architectures by up to 3.8x and 6.8x in energy efficiency (TOPS/W) and compute efficiency (TOPS/mm2), respectively, without sacrificing GNN prediction accuracy. Finally, we demonstrate the applicability of HePGA to accelerate inference of emerging transformer models.
Problem

Research questions and friction points this paper is trying to address.

Optimizing GNN training on diverse PIM devices
Mapping GNN layers efficiently across heterogeneous architectures
Balancing performance and energy without accuracy loss
Innovation

Methods, ideas, or system contributions that make the work stand out.

Heterogeneous PIM architecture via 3D integration
Optimized mapping across multiple PIM devices
Combines ReRAM, FeFET, PCM, MRAM, and SRAM technologies
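The mapping idea behind these contributions can be illustrated with a toy heuristic: score each candidate PIM device against the access pattern of a given GNN training kernel. All device numbers, kernel names, and selection rules below are invented for illustration; they are not the paper's actual mapping algorithm or measured device data.

```python
# Toy sketch (assumptions, not from the paper): pick a PIM device class per
# GNN training kernel based on hypothetical relative device characteristics.

# Invented relative metrics: read latency, write cost, storage density,
# and write endurance (cycles). Real values differ per technology node.
DEVICES = {
    "ReRAM": {"read": 1.0, "write": 6.0, "density": 9.0, "endurance": 1e6},
    "FeFET": {"read": 1.2, "write": 1.5, "density": 8.0, "endurance": 1e10},
    "PCM":   {"read": 1.1, "write": 8.0, "density": 8.5, "endurance": 1e8},
    "MRAM":  {"read": 1.5, "write": 2.5, "density": 5.0, "endurance": 1e15},
    "SRAM":  {"read": 0.5, "write": 0.5, "density": 1.0,
              "endurance": float("inf")},
}

def pick_device(kernel: str) -> str:
    """Map a (hypothetical) GNN training kernel to a device class."""
    if kernel == "feature_aggregation":
        # Neighbor aggregation is read-dominated on transient data:
        # favor the lowest read latency.
        return min(DEVICES, key=lambda d: DEVICES[d]["read"])
    if kernel == "weight_mvm":
        # Forward matrix-vector multiply on mostly-static weights:
        # favor the densest storage.
        return max(DEVICES, key=lambda d: DEVICES[d]["density"])
    if kernel == "weight_update":
        # Gradient updates write constantly: require high endurance,
        # then trade density against write cost.
        durable = {d: v for d, v in DEVICES.items()
                   if v["endurance"] >= 1e9}
        return max(durable, key=lambda d: durable[d]["density"]
                   / durable[d]["write"])
    raise ValueError(f"unknown kernel: {kernel}")
```

With these invented numbers, aggregation lands on SRAM, static weight MVMs on ReRAM, and write-heavy updates on FeFET; the point is only that kernel access patterns, not a single device metric, drive the placement.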
Chukwufumnanya Ogbogu
Washington State University, Pullman, WA, USA
Gaurav Narang
Arizona State University, Tempe, AZ, USA
Biresh Kumar Joardar
University of Houston
Heterogeneous architectures, Machine Learning, Processing-in-Memory, Hardware Reliability
Janardhan Rao Doppa
Washington State University, Pullman, WA, USA
Krishnendu Chakrabarty
Fulton Professor of Microelectronics, School of Electrical, Computer and Energy Engineering
Electronic design automation, Testing and Design-for-Testability, Microfluidics, Computer Engineering, Sensor Networks
Partha Pratim Pande
Washington State University, Pullman, WA, USA