🤖 AI Summary
Bayes2IMC addresses the high computational overhead of Bayesian binary neural network (BBNN) inference in resource-constrained settings. It proposes an energy-efficient uncertainty modeling framework leveraging in-memory computing (IMC) on phase-change memory (PCM). Methodologically, it is the first to exploit PCM's intrinsic conductance noise to directly sample Bayesian weight distributions, eliminating the need for external entropy sources and pre-neuron analog-to-digital converters (ADCs). It further introduces a hardware-software co-designed correction mechanism applied solely to output-layer logits and a conductance-drift-resilient compensation algorithm to ensure deployment robustness. Evaluated on CIFAR-10, Bayes2IMC achieves accuracy comparable to ideal floating-point Bayesian inference. Against an equivalent SRAM baseline, it improves total efficiency (GOPS/W/mm²) by 3.8–9.6× and power efficiency (GOPS/W) by 2.2–5.6×, and it achieves up to 20% higher power efficiency than state-of-the-art memristor-based BNN accelerators.
📝 Abstract
Bayesian Neural Networks (BNNs) provide superior estimates of uncertainty by generating an ensemble of predictive distributions. However, inference via ensembling is resource-intensive, requiring additional entropy sources to generate stochasticity, which increases resource consumption. We introduce Bayes2IMC, an in-memory computing (IMC) architecture for binary Bayesian neural networks that leverages nanoscale device stochasticity to generate the desired distributions. Our approach utilizes Phase-Change Memory (PCM) to harness its inherent noise characteristics, enabling a binary Bayesian neural network. This design eliminates the necessity for a pre-neuron Analog-to-Digital Converter (ADC), significantly improving power and area efficiency. We also develop a hardware-software co-optimized correction method applied solely to the logits in the final layer to reduce device-induced accuracy variations across hardware deployments. Additionally, we devise a simple compensation technique that ensures no drop in classification accuracy despite the conductance drift of PCM. We validate the effectiveness of our approach on the CIFAR-10 dataset with a VGGBinaryConnect model, achieving accuracy metrics comparable to ideal software implementations as well as to results reported in the literature using other technologies. Finally, we present a complete core architecture and compare its projected power, performance, and area efficiency against an equivalent SRAM baseline, showing a $3.8\times$ to $9.6\times$ improvement in total efficiency (in GOPS/W/mm$^2$) and a $2.2\times$ to $5.6\times$ improvement in power efficiency (in GOPS/W). In addition, the projected hardware performance of Bayes2IMC surpasses that of most memristive-device-based BNN architectures reported in the literature, achieving up to $20\%$ higher power efficiency than the state of the art.
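To make the ensembling idea concrete, here is a minimal software sketch of Bayesian binary inference for a single linear layer: each weight is drawn from a Bernoulli distribution over $\{-1, +1\}$, forward passes are repeated with fresh samples, and the averaged softmax gives the predictive distribution. All names, shapes, and the uniform Bernoulli parameters are hypothetical; in Bayes2IMC the weight samples would come from intrinsic PCM conductance noise rather than a software RNG, and this sketch does not model the logit correction or drift compensation described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions: one binary linear layer with per-weight
# Bernoulli parameters p = P(weight = +1). In hardware these samples
# would be produced by PCM conductance noise; here an RNG stands in.
n_in, n_out, n_samples = 8, 3, 200
p = rng.uniform(0.2, 0.8, size=(n_in, n_out))  # per-weight distributions
x = rng.standard_normal(n_in)                  # one input vector

def softmax(z):
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

# Ensemble inference: draw binary weights in {-1, +1} per forward pass
# and average the softmax outputs to form the predictive distribution.
probs = np.zeros(n_out)
for _ in range(n_samples):
    w = np.where(rng.random((n_in, n_out)) < p, 1.0, -1.0)
    probs += softmax(x @ w)
probs /= n_samples

# Predictive entropy of the averaged distribution as an uncertainty score.
entropy = -np.sum(probs * np.log(probs + 1e-12))
print(probs, entropy)
```

This is exactly why ensembling is resource-hungry in conventional hardware: every sample needs a fresh random weight draw, which Bayes2IMC obtains for free from device noise instead of an external entropy source.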