Bayes2IMC: In-Memory Computing for Bayesian Binary Neural Networks

📅 2024-11-12
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Bayes2IMC addresses the high computational overhead of Bayesian binary neural network (BBNN) inference in resource-constrained settings. It proposes an energy-efficient uncertainty-modeling framework leveraging in-memory computing (IMC) on phase-change memory (PCM). Methodologically, it is the first to exploit PCM's intrinsic conductance noise to directly sample Bayesian weight distributions, eliminating the need for external entropy sources and pre-neuron analog-to-digital converters (ADCs). It further introduces a hardware-software co-designed correction mechanism applied solely to output-layer logits and a simple compensation technique that preserves accuracy under PCM conductance drift. Evaluated on CIFAR-10, Bayes2IMC achieves accuracy comparable to ideal floating-point Bayesian inference. It improves total efficiency (GOPS/W/mm²) by 3.8–9.6× and power efficiency (GOPS/W) by 2.2–5.6× over an equivalent SRAM baseline, and delivers up to 20% higher power efficiency than state-of-the-art memristor-based BNN accelerators.
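The core sampling idea can be illustrated with a small simulation. The sketch below is a minimal illustration, not the paper's device model: it assumes Gaussian read noise and obtains each binary weight by comparing a noisy conductance read against a fixed reference, so that the programmed mean conductance encodes the probability of the weight being +1. All parameter values and function names are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm  # inverse Gaussian CDF for programming targets

def program_conductances(p_plus, g_ref=10.0, g_sigma=1.0):
    """Choose mean conductances so that Gaussian read noise alone realizes the
    desired Bernoulli probabilities (illustrative noise model, not the paper's):

        P(g_read > g_ref) = Phi((g_mean - g_ref) / g_sigma) = p_plus
        => g_mean = g_ref + g_sigma * Phi^{-1}(p_plus)
    """
    p_plus = np.clip(np.asarray(p_plus, dtype=float), 1e-6, 1 - 1e-6)
    return g_ref + g_sigma * norm.ppf(p_plus)

def sample_binary_weights(g_mean, g_ref=10.0, g_sigma=1.0, n_samples=8, rng=None):
    """Each noisy read yields one {-1, +1} weight sample, with no external RNG."""
    rng = np.random.default_rng(0) if rng is None else rng
    g_read = g_mean + g_sigma * rng.standard_normal((n_samples,) + np.shape(g_mean))
    return np.where(g_read > g_ref, 1, -1)

# Example: a weight with P(w = +1) = 0.7, sampled over an ensemble of reads.
g = program_conductances([0.7])
w = sample_binary_weights(g, n_samples=10000)
print(w.mean())  # empirical mean close to 2 * 0.7 - 1 = 0.4
```

Because the stochasticity comes from the device read itself, each forward pass of the ensemble reuses the same programmed array; only the probability encoding (here via the inverse CDF) is done in software.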

📝 Abstract
Bayesian Neural Networks (BNNs) provide superior estimates of uncertainty by generating an ensemble of predictive distributions. However, inference via ensembling is resource-intensive, requiring additional entropy sources to generate stochasticity which increases resource consumption. We introduce Bayes2IMC, an in-memory computing (IMC) architecture designed for binary Bayesian neural networks that leverage nanoscale device stochasticity to generate desired distributions. Our novel approach utilizes Phase-Change Memory (PCM) to harness inherent noise characteristics, enabling the creation of a binary neural network. This design eliminates the necessity for a pre-neuron Analog-to-Digital Converter (ADC), significantly improving power and area efficiency. We also develop a hardware-software co-optimized correction method applied solely on the logits in the final layer to reduce device-induced accuracy variations across deployments on hardware. Additionally, we devise a simple compensation technique that ensures no drop in classification accuracy despite conductance drift of PCM. We validate the effectiveness of our approach on the CIFAR-10 dataset with a VGGBinaryConnect model, achieving accuracy metrics comparable to ideal software implementations as well as results reported in the literature using other technologies. Finally, we present a complete core architecture and compare its projected power, performance, and area efficiency against an equivalent SRAM baseline, showing a 3.8× to 9.6× improvement in total efficiency (in GOPS/W/mm²) and a 2.2× to 5.6× improvement in power efficiency (in GOPS/W). In addition, the projected hardware performance of Bayes2IMC surpasses that of most of the BNN architectures based on memristive devices reported in the literature, and achieves up to 20% higher power efficiency compared to the state-of-the-art.
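The abstract does not detail the drift compensation, but the widely used empirical PCM drift model G(t) = G(t0)·(t/t0)^(−ν) makes the issue concrete. The sketch below pairs that model with a generic global rescaling of the readouts; the drift exponent, time values, and the compensation itself are assumptions for illustration rather than the paper's specific technique.

```python
import numpy as np

def drifted_conductance(g0, t, t0=1.0, nu=0.05):
    """Standard empirical PCM drift model: G(t) = G(t0) * (t / t0) ** (-nu).

    g0 : conductance measured at reference time t0
    nu : drift exponent (illustrative value; real devices are typically ~0.01-0.1)
    """
    return g0 * (t / t0) ** (-nu)

def global_drift_compensation(g_read, t, t0=1.0, nu=0.05):
    """Generic compensation: rescale readouts by (t / t0) ** nu so that the mean
    conductance, and hence the pre-activation statistics, is restored. This is a
    common memristor-IMC remedy, not necessarily the paper's exact scheme."""
    return g_read * (t / t0) ** nu

g0 = np.array([5.0, 10.0, 20.0])           # programmed conductances at t0
g_late = drifted_conductance(g0, t=1e4)    # conductances after drift
print(global_drift_compensation(g_late, t=1e4))  # restored to ~[5, 10, 20]
```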
Problem

Research questions and friction points this paper is trying to address.

Enhance the efficiency of Bayesian neural network inference
Reduce resource consumption in BNNs
Improve power and area efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Phase-Change Memory for stochasticity
Eliminates pre-neuron ADC
Hardware-software co-optimized correction applied only to final-layer logits (see the sketch after this list)
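Since the page does not describe the form of this output-layer correction, the sketch below is a hypothetical illustration only: it assumes a per-class affine correction of the hardware logits, calibrated in software by least squares against floating-point reference logits. The function names and the fitting procedure are assumptions, not the paper's algorithm.

```python
import numpy as np

def fit_logit_correction(hw_logits, ref_logits):
    """Fit a per-class affine correction  z_corr = a * z_hw + b  by least squares,
    using a small calibration batch of hardware and reference (software) logits.

    hw_logits, ref_logits : arrays of shape (n_samples, n_classes)
    Returns per-class scale a and offset b.
    """
    z = np.asarray(hw_logits, dtype=float)
    r = np.asarray(ref_logits, dtype=float)
    z_mean, r_mean = z.mean(axis=0), r.mean(axis=0)
    zc, rc = z - z_mean, r - r_mean
    a = (zc * rc).sum(axis=0) / np.maximum((zc * zc).sum(axis=0), 1e-12)
    b = r_mean - a * z_mean
    return a, b

def apply_logit_correction(hw_logits, a, b):
    """Apply the correction at inference time; only the final layer is touched."""
    return a * np.asarray(hw_logits, dtype=float) + b

# Toy usage: hardware logits are a scaled, shifted, noisy version of the reference.
rng = np.random.default_rng(0)
ref = rng.standard_normal((256, 10))
hw = 0.8 * ref + 0.3 + 0.05 * rng.standard_normal(ref.shape)
a, b = fit_logit_correction(hw, ref)
print(np.abs(apply_logit_correction(hw, a, b) - ref).mean())  # small residual
```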
Prabodh Katti
Centre for Intelligent Information Processing Systems (CIIPS), Department of Engineering, King’s College London, London WC2R 2LS, U.K.
Clement Ruah
Centre for Intelligent Information Processing Systems (CIIPS), Department of Engineering, King’s College London, London WC2R 2LS, U.K.
Osvaldo Simeone
King's College London
Information theory, machine learning, quantum information processing, wireless systems
Bashir M. Al-Hashimi
Professor of Computer Engineering, King’s College London
Energy-efficient computing, embedded systems, design for test
Bipin Rajendran
Professor of Intelligent Computing Systems at King's College London
Nanoscale logic and memory devices, neuromorphic computation