Efficient SRAM-PIM Co-design by Joint Exploration of Value-Level and Bit-Level Sparsity

📅 2025-05-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Digital SRAM-based Processing-in-Memory (PIM) architectures struggle to jointly exploit value-level and bit-level sparsity, and static zero-value/zero-bit handling incurs significant energy overhead. To address this, we propose Dyadic Block PIM (DB-PIM), an algorithm-architecture co-design framework. DB-PIM couples a hybrid-grained pruning strategy that unifies value- and bit-level sparsity with a dynamic block computation paradigm and customized hardware: a Canonical Signed Digit (CSD)-based adder tree, an input pre-processing unit (IPU), dyadic block multiply units (DBMUs), and a custom sparse SRAM-PIM macro. Unlike conventional crossbar-based PIMs constrained by static zero-value masking, DB-PIM enables adaptive, fine-grained sparsity exploitation. Evaluated on benchmark workloads, it achieves up to 8.01× speedup and 85.28% energy savings without accuracy loss, significantly improving both the energy efficiency and computational flexibility of digital PIM systems.
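The two sparsity levels the summary distinguishes can be made concrete with a short measurement sketch. The snippet below is illustrative only (NumPy on a synthetic 8-bit weight tensor, not the authors' code): it reports value-level sparsity as the fraction of zero-valued weights and bit-level sparsity as the fraction of zero bits inside the remaining non-zero weights, which is the additional redundancy DB-PIM targets.

```python
# Illustrative sketch (not from the paper): quantify value-level vs. bit-level
# sparsity in an 8-bit quantized weight tensor. The tensor and pruning ratio
# below are synthetic stand-ins; bit-level sparsity is counted on the magnitude
# bits of the non-zero values only, mirroring the "zero bits within non-zero
# values" notion used in the summary.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.integers(-127, 128, size=10_000)      # stand-in for one layer's weights
weights[rng.random(weights.shape) < 0.6] = 0        # pretend ~60% were pruned to zero

value_sparsity = np.mean(weights == 0)

magnitudes = np.abs(weights[weights != 0]).astype(np.uint8)
bits = np.unpackbits(magnitudes[:, None], axis=1)   # 8 bits per non-zero weight
bit_sparsity = 1.0 - bits.mean()                    # fraction of zero bits

print(f"value-level sparsity: {value_sparsity:.2%}")
print(f"bit-level sparsity within non-zero weights: {bit_sparsity:.2%}")
```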

📝 Abstract
Processing-in-memory (PIM) is a transformative architectural paradigm designed to overcome the Von Neumann bottleneck. Among PIM architectures, digital SRAM-PIM emerges as a promising solution, offering significant advantages by directly integrating digital logic within the SRAM array. However, rigid crossbar architecture and full array activation pose challenges in efficiently utilizing traditional value-level sparsity. Moreover, neural network models exhibit a high proportion of zero bits within non-zero values, which remain underutilized due to architectural constraints. To overcome these limitations, we present Dyadic Block PIM (DB-PIM), a groundbreaking algorithm-architecture co-design framework to harness both value-level and bit-level sparsity. At the algorithm level, our hybrid-grained pruning technique, combined with a novel sparsity pattern, enables effective sparsity management. Architecturally, DB-PIM incorporates a sparse network and customized digital SRAM-PIM macros, including input pre-processing unit (IPU), dyadic block multiply units (DBMUs), and Canonical Signed Digit (CSD)-based adder trees. It circumvents structured zero values in weights and bypasses unstructured zero bits within non-zero weights and block-wise all-zero bit columns in input features. As a result, the DB-PIM framework skips a majority of unnecessary computations, thereby driving significant gains in computational efficiency. Results demonstrate that our DB-PIM framework achieves up to 8.01x speedup and 85.28% energy savings, significantly boosting computational efficiency in digital SRAM-PIM systems.
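The CSD-based adder trees mentioned in the abstract build on the Canonical Signed Digit representation, which rewrites each weight with digits {-1, 0, +1} such that no two adjacent digits are non-zero, minimizing the number of non-zero digits and hence the partial products that must be accumulated. The following is a minimal sketch of a textbook CSD encoder for intuition; it is an illustrative assumption, not the paper's implementation:

```python
# Minimal sketch (illustrative, not the authors' code): textbook CSD encoding.
# CSD uses digits {-1, 0, +1} with no two adjacent non-zeros, so a weight
# usually has fewer non-zero digits than its plain binary form; every zero
# digit corresponds to a partial product a CSD-based adder tree can skip.

def csd_encode(w: int) -> list[int]:
    """Return the CSD digits of integer w, least-significant digit first."""
    digits, n = [], abs(w)
    sign = -1 if w < 0 else 1
    while n:
        if n % 2:                 # odd: emit +1 or -1 so the remainder stays even
            d = 2 - (n % 4)       # +1 if n % 4 == 1, -1 if n % 4 == 3
            n -= d
        else:
            d = 0
        digits.append(sign * d)
        n //= 2
    return digits

def binary_nonzeros(w: int) -> int:
    return bin(abs(w)).count("1")

if __name__ == "__main__":
    for w in (7, 15, 59, 118):
        csd = csd_encode(w)
        print(f"w={w:4d}  binary non-zero bits={binary_nonzeros(w)}  "
              f"CSD non-zero digits={sum(d != 0 for d in csd)}  CSD (LSB first)={csd}")
```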
Problem

Research questions and friction points this paper is trying to address.

Overcoming the Von Neumann bottleneck with SRAM-PIM co-design
Utilizing value-level and bit-level sparsity in neural networks
Enhancing computational efficiency via algorithm-architecture optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Jointly exploits value-level and bit-level sparsity
Introduces hybrid-grained pruning and sparsity pattern
Custom SRAM-PIM macros with IPU and DBMUs
Cenlin Duan
Fert Beijing Research Institute, School of Integrated Circuit Science and Engineering, Beihang University, Beijing, 100191, China
Jianlei Yang
Beihang University
Deep Learning | Computer Architecture | Neuromorphic Computing | Spintronics | EDA/VLSI
Yikun Wang
Fudan University
Computer Vision | Natural Language Processing
Yiou Wang
School of Computer Science and Engineering, Beihang University, Beijing 100191, China, and Qingdao Research Institute, Beihang University, Qingdao 266104, China
Yingjie Qi
School of Computer Science and Engineering, Beihang University, Beijing 100191, China, and Qingdao Research Institute, Beihang University, Qingdao 266104, China
Xiaolin He
School of Computer Science and Engineering, Beihang University, Beijing 100191, China, and Qingdao Research Institute, Beihang University, Qingdao 266104, China
Bonan Yan
Assistant Professor, Peking University
Brain-Inspired Computing | Emerging Memories | AGI Processor
Xueyan Wang
Fert Beijing Research Institute, School of Integrated Circuit Science and Engineering, Beihang University, Beijing, 100191, China
Xiaotao Jia
Fert Beijing Research Institute, School of Integrated Circuit Science and Engineering, Beihang University, Beijing, 100191, China
Weisheng Zhao
Fert Beijing Institute, Beihang University
Spintronics Devices and Integrated Circuits