Negative Feedback Training: A Novel Concept to Improve Robustness of NVCiM DNN Accelerators

📅 2023-05-23

🏛️ arXiv.org

📈 Citations: 7

✨ Influential: 0

🤖 AI Summary

To address the degradation of DNN inference accuracy on non-volatile compute-in-memory (NVCiM) accelerators caused by device-level stochasticity, this paper proposes negative feedback training (NFT), a training concept inspired by control theory. NFT uses multi-scale noisy information captured from the network as a feedback signal, which addresses a limitation of conventional variation-aware training: reliance on the model's final output alone. The paper instantiates two variants, Oriented Variational Forward (OVF) and Intermediate Representation Snapshot (IRS). In the reported experiments, NFT improves inference accuracy by up to 46.71% over existing state-of-the-art methods while reducing epistemic uncertainty, boosting output confidence, and improving convergence probability.

📝 Abstract

Compute-in-memory (CIM) accelerators built upon non-volatile memory (NVM) devices excel in energy efficiency and latency when performing Deep Neural Network (DNN) inference, thanks to their in-situ data processing capability. However, the stochastic nature and intrinsic variations of NVM devices often result in performance degradation in DNN inference. Introducing these non-ideal device behaviors during DNN training enhances robustness, but drawbacks include limited accuracy improvement, reduced prediction confidence, and convergence issues. This arises from a mismatch between the deterministic training and non-deterministic device variations, as such training, though considering variations, relies solely on the model's final output. In this work, we draw inspiration from control theory and propose a novel training concept: Negative Feedback Training (NFT), which leverages the multi-scale noisy information captured from the network. We develop two specific NFT instances, Oriented Variational Forward (OVF) and Intermediate Representation Snapshot (IRS). Extensive experiments show that our methods outperform existing state-of-the-art methods with up to a 46.71% improvement in inference accuracy while reducing epistemic uncertainty, boosting output confidence, and improving convergence probability. Their effectiveness highlights the generality and practicality of our NFT concept in enhancing DNN robustness against device variations.
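
The device-variation problem described in the abstract can be illustrated with a minimal sketch. This is not the paper's method or device model: it assumes additive Gaussian weight noise with standard deviation `sigma * max|w|`, and the names `noisy_weights`, `linear`, and `output_spread` are hypothetical helpers introduced only for illustration. It shows how the output of a single linear layer drifts from its noise-free value as the assumed variation level grows.

```python
import random

def noisy_weights(weights, sigma, rng):
    """Return a copy of `weights` with additive Gaussian noise.

    Assumption: noise std is sigma * max|w|, a simplified stand-in for
    NVM device variation; the paper's device model may differ.
    """
    w_max = max(abs(w) for row in weights for w in row)
    return [[w + rng.gauss(0.0, sigma * w_max) for w in row] for row in weights]

def linear(weights, x):
    """Plain matrix-vector product: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def output_spread(weights, x, sigma, trials=200, seed=0):
    """Mean absolute deviation of noisy outputs from the noise-free output."""
    rng = random.Random(seed)
    clean = linear(weights, x)
    total = 0.0
    for _ in range(trials):
        noisy = linear(noisy_weights(weights, sigma, rng), x)
        total += sum(abs(a - b) for a, b in zip(noisy, clean)) / len(clean)
    return total / trials

# Hypothetical toy layer and input, chosen only for illustration.
W = [[0.5, -1.0, 0.25], [1.5, 0.0, -0.75]]
x = [1.0, 2.0, -1.0]
for sigma in (0.0, 0.05, 0.2):
    print(sigma, round(output_spread(W, x, sigma), 4))
```

Variation-aware training, as the abstract describes it, would apply this kind of perturbation during the training forward pass; how NFT builds its feedback signal from such noisy passes is specified in the paper itself and is not reproduced here.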

Problem

Research questions and friction points this paper is trying to address.

Improving robustness of compute-in-memory DNN accelerators against device variations

Addressing performance degradation from non-volatile memory stochastic behavior

Resolving mismatch between deterministic training and non-deterministic device variations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Negative Feedback Training inspired by control theory

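Two NFT instances, Oriented Variational Forward (OVF) and Intermediate Representation Snapshot (IRS), that use multi-scale noisy information captured from the network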

Improves robustness against non-volatile memory variations
