🤖 AI Summary
Bayes2IMC addresses the high computational overhead of Bayesian binary neural network (BBNN) inference in resource-constrained settings. It proposes an energy-efficient uncertainty modeling framework leveraging in-memory computing (IMC) on phase-change memory (PCM). Methodologically, it is the first to exploit PCM's intrinsic conductance noise to directly sample Bayesian weight distributions, eliminating the need for external entropy sources and pre-neuron analog-to-digital converters (ADCs). It further introduces a hardware-software co-designed correction mechanism applied solely to output-layer logits and a conductance-drift-resilient compensation algorithm to ensure deployment robustness. Evaluated on CIFAR-10, Bayes2IMC achieves accuracy comparable to ideal floating-point Bayesian inference. Against an equivalent SRAM baseline, it improves total efficiency (GOPS/W/mm²) by 3.8–9.6× and power efficiency (GOPS/W) by 2.2–5.6×, and it achieves up to 20% higher power efficiency than state-of-the-art memristor-based BNN accelerators.
📝 Abstract
Bayesian Neural Networks (BNNs) provide superior estimates of uncertainty by generating an ensemble of predictive distributions. However, inference via ensembling is resource-intensive, requiring additional entropy sources to generate stochasticity, which increases resource consumption. We introduce Bayes2IMC, an in-memory computing (IMC) architecture for binary Bayesian neural networks that leverages nanoscale device stochasticity to generate the desired distributions. Our approach utilizes Phase-Change Memory (PCM) to harness its inherent noise characteristics, enabling a binary Bayesian neural network. This design eliminates the necessity for a pre-neuron Analog-to-Digital Converter (ADC), significantly improving power and area efficiency. We also develop a hardware-software co-optimized correction method applied solely to the logits in the final layer to reduce device-induced accuracy variations across hardware deployments. Additionally, we devise a simple compensation technique that ensures no drop in classification accuracy despite the conductance drift of PCM. We validate the effectiveness of our approach on the CIFAR-10 dataset with a VGGBinaryConnect model, achieving accuracy metrics comparable to ideal software implementations as well as to results reported in the literature using other technologies. Finally, we present a complete core architecture and compare its projected power, performance, and area efficiency against an equivalent SRAM baseline, showing a $3.8\times$ to $9.6\times$ improvement in total efficiency (in GOPS/W/mm$^2$) and a $2.2\times$ to $5.6\times$ improvement in power efficiency (in GOPS/W). In addition, the projected hardware performance of Bayes2IMC surpasses that of most memristive-device-based BNN architectures reported in the literature, achieving up to $20\%$ higher power efficiency than the state of the art.
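To make the ensembling idea concrete, here is a minimal software sketch of Bayesian binary inference for a single linear layer: each weight is drawn from a Bernoulli distribution over $\{-1, +1\}$, forward passes are repeated with fresh samples, and the averaged softmax gives the predictive distribution. All names, shapes, and the uniform Bernoulli parameters are hypothetical; in Bayes2IMC the weight samples would come from intrinsic PCM conductance noise rather than a software RNG, and this sketch does not model the logit correction or drift compensation described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions: one binary linear layer with per-weight
# Bernoulli parameters p = P(weight = +1). In hardware these samples
# would be produced by PCM conductance noise; here an RNG stands in.
n_in, n_out, n_samples = 8, 3, 200
p = rng.uniform(0.2, 0.8, size=(n_in, n_out))  # per-weight distributions
x = rng.standard_normal(n_in)                  # one input vector

def softmax(z):
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

# Ensemble inference: draw binary weights in {-1, +1} per forward pass
# and average the softmax outputs to form the predictive distribution.
probs = np.zeros(n_out)
for _ in range(n_samples):
    w = np.where(rng.random((n_in, n_out)) < p, 1.0, -1.0)
    probs += softmax(x @ w)
probs /= n_samples

# Predictive entropy of the averaged distribution as an uncertainty score.
entropy = -np.sum(probs * np.log(probs + 1e-12))
print(probs, entropy)
```

This is exactly why ensembling is resource-hungry in conventional hardware: every sample needs a fresh random weight draw, which Bayes2IMC obtains for free from device noise instead of an external entropy source.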