DISCA: A Digital In-memory Stochastic Computing Architecture Using A Compressed Bent-Pyramid Format

2025-11-21
AI Summary
Existing in-memory computing (IMC) architectures face tight hardware budgets, severe memory bottlenecks, and low energy efficiency when deploying large-scale AI models at the edge. To address this, the paper proposes DISCA, a digital in-memory stochastic computing architecture built on a compressed version of the quasi-stochastic Bent-Pyramid data format. The architecture combines the computational simplicity of analog computing with the scalability, productivity, and reliability of digital systems, using quasi-random data encoding for matrix operations. Post-layout modeling in a commercial 180 nm CMOS technology shows an energy efficiency of 3.59 TOPS/W per bit at 500 MHz, which the authors describe as an orders-of-magnitude improvement for matrix multiplication workloads when scaled and compared with counterpart architectures. The key contribution is the pairing of compressed Bent-Pyramid encoding with digital in-memory stochastic computing.

Abstract
Nowadays, we are witnessing an Artificial Intelligence revolution that dominates the technology landscape in various application domains, such as healthcare, robotics, automotive, security, and defense. Massive-scale AI models, which mimic the human brain's functionality, typically feature millions or even billions of parameters and rely on data-intensive matrix multiplication. While conventional von Neumann architectures struggle with the memory wall and the end of Moore's Law, these AI applications are migrating rapidly towards the edge, for example in robotics and unmanned aerial vehicles for surveillance, which further constrains the hardware budget of AI architectures at the edge. Although in-memory computing has been proposed as a promising solution for the memory wall, both analog and digital in-memory computing architectures suffer from substantial degradation of the proposed benefits due to various design limitations. We propose a new digital in-memory stochastic computing architecture, DISCA, utilizing a compressed version of the quasi-stochastic Bent-Pyramid data format. DISCA inherits the computational simplicity of analog computing while preserving the scalability, productivity, and reliability of digital systems. Post-layout modeling results of DISCA show an energy efficiency of 3.59 TOPS/W per bit at 500 MHz using a commercial 180 nm CMOS technology. DISCA therefore improves the energy efficiency of matrix multiplication workloads by orders of magnitude when scaled and compared with its counterpart architectures.
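
To make the idea of stochastic computing with quasi-random encoding more concrete, here is a minimal generic sketch in Python. It is not the Bent-Pyramid format and not DISCA's circuit, neither of which is specified in this summary. It assumes the textbook unipolar scheme, in which a value in [0, 1] becomes a bitstream whose fraction of ones approximates the value, and a bitwise AND of two uncorrelated streams approximates their product. The van der Corput sequences (bases 2 and 3) stand in for a quasi-random number source, and all function names are hypothetical.

```python
def van_der_corput(index: int, base: int) -> float:
    """Return the index-th element of the base-`base` van der Corput sequence."""
    value, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        value += digit / denom
    return value


def encode(value: float, length: int, base: int) -> list[int]:
    """Unipolar encoding (assumed scheme): bit i is 1 when the i-th
    low-discrepancy sample falls below `value`."""
    return [1 if van_der_corput(i, base) < value else 0 for i in range(length)]


def sc_multiply(a: float, b: float, length: int = 256) -> float:
    """Estimate a * b for a, b in [0, 1] by AND-ing two bitstreams.

    Different bases keep the two streams decorrelated, which the
    AND-as-multiplier trick relies on.
    """
    stream_a = encode(a, length, base=2)
    stream_b = encode(b, length, base=3)
    ones = sum(x & y for x, y in zip(stream_a, stream_b))
    return ones / length


if __name__ == "__main__":
    estimate = sc_multiply(0.5, 0.25)
    print(f"estimate = {estimate:.4f}, exact = {0.5 * 0.25:.4f}")
```

The multiplier here is a single AND gate per bit, which illustrates the kind of computational simplicity the abstract refers to. The cost is that precision depends on stream length, so a compressed stream format matters for memory footprint.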
Problem

Research questions and friction points this paper is trying to address.

Proposes digital in-memory stochastic computing for AI matrix multiplication
Addresses memory wall and hardware constraints in edge AI devices
Improves energy efficiency for large-scale neural network computations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Digital in-memory stochastic computing architecture
Compressed bent-pyramid format for data
Enhanced energy efficiency for matrix multiplication
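
As a rough reading aid for the reported 3.59 TOPS/W per bit, note that TOPS/W is dimensionally tera-operations per joule, so its reciprocal is the energy per operation in picojoules. The Python sketch below performs only this unit conversion. The function name is hypothetical, and how the paper defines an operation or applies the per-bit normalization is not stated in this summary, so the result inherits whatever definition the authors use.

```python
def energy_per_op_picojoules(tops_per_watt: float) -> float:
    """Convert an efficiency in TOPS/W into energy per operation in pJ.

    1 TOPS/W = 1e12 operations per joule, and 1 J = 1e12 pJ,
    so the result is simply 1 / tops_per_watt.
    """
    ops_per_joule = tops_per_watt * 1e12
    joules_per_op = 1.0 / ops_per_joule
    return joules_per_op * 1e12


if __name__ == "__main__":
    reported = 3.59  # TOPS/W per bit, as quoted in the abstract
    print(f"{energy_per_op_picojoules(reported):.3f} pJ per operation per bit")
    # prints "0.279 pJ per operation per bit"
```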