🤖 AI Summary
This work addresses the vulnerability of compute-in-memory (CiM) neural accelerators, which are attractive for safety-critical applications, to non-idealities in non-volatile memory devices such as write variability, conductance drift, and stochastic noise, which can cause severe accuracy degradation under worst-case conditions. To bridge the significant gap between average-case performance and worst-case behavior, the authors propose a cross-layer co-design framework built on three techniques: a worst-case-oriented reliability analysis, SWIM, a selective write-verify mechanism that applies verification only where it is most impactful, and a training methodology that injects right-censored Gaussian noise to align training assumptions with hardware-induced variations. Together, these techniques substantially improve inference robustness against small device variations that could otherwise trigger catastrophic failures, while preserving the inherent energy-efficiency advantages of CiM architectures.
📝 Abstract
Compute-in-memory (CiM) architectures promise significant improvements in energy efficiency and throughput for deep neural network acceleration by alleviating the von Neumann bottleneck. However, their reliance on emerging non-volatile memory devices introduces device-level non-idealities (such as write variability, conductance drift, and stochastic noise) that fundamentally challenge reliability, predictability, and safety, especially in safety-critical applications. This talk examines the reliability limits of CiM-based neural accelerators and presents a series of techniques that bridge device physics, architecture, and learning algorithms to address these challenges. We first demonstrate that even small device variations can lead to disproportionately large accuracy degradation and catastrophic failures in safety-critical inference workloads, revealing a critical gap between average-case evaluations and worst-case behavior. Building on this insight, we introduce SWIM, a selective write-verify mechanism that strategically applies verification only where it is most impactful, significantly improving reliability while maintaining CiM's efficiency advantages. Finally, we explore a learning-centric solution that improves realistic worst-case performance by training neural networks with right-censored Gaussian noise, aligning training assumptions with hardware-induced variability and enabling robust deployment without excessive hardware overhead. Together, these works highlight the necessity of cross-layer co-design for CiM accelerators and provide a principled path toward dependable, efficient neural inference on emerging memory technologies, paving the way for their adoption in safety- and reliability-critical systems.
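To make the noise-injection idea concrete, the sketch below shows one plausible way to perturb weights with right-censored Gaussian noise during training-time forward passes. The parameterization (`sigma` for the noise scale, `cap` for the censoring point in units of `sigma`) is an illustrative assumption, not the authors' exact formulation; the key property is that draws above the cap are clipped to it, so the training-time noise distribution has a bounded right tail matching the bounded worst-case deviation of a write-verified device.

```python
import numpy as np

def right_censored_gaussian(shape, sigma=0.1, cap=2.0, rng=None):
    """Sample Gaussian noise, then right-censor it.

    Any draw above cap * sigma is clipped down to cap * sigma,
    so the right tail is bounded (hypothetical parameterization).
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    noise = rng.normal(loc=0.0, scale=sigma, size=shape)
    return np.minimum(noise, cap * sigma)

# Toy usage: perturb a weight matrix as a hardware-aware training
# step would before computing the forward pass.
w = np.ones((4, 4))
w_noisy = w + right_censored_gaussian(w.shape)
```

Note that only the right tail is censored here; the left tail is left intact, which is what distinguishes this from symmetric noise clipping.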