Negative Feedback Training: A Novel Concept to Improve Robustness of NVCiM DNN Accelerators

πŸ“… 2023-05-23
πŸ›οΈ arXiv.org
πŸ“ˆ Citations: 7
✨ Influential: 0
πŸ€– AI Summary
To address the degradation of DNN inference accuracy on non-volatile compute-in-memory (NVCiM) accelerators caused by device-level stochasticity, this paper proposes Negative Feedback Training (NFT), a novel training paradigm inspired by control theory. Rather than relying solely on final-output supervision as conventional variation-aware training does, NFT exploits multi-scale noisy information captured from intermediate layers of the network, instantiated in two variants: Oriented Variational Forward (OVF) and Intermediate Representation Snapshot (IRS). The approach combines variational noise-aware forward passes with snapshots of intermediate representations while remaining end-to-end differentiable. In extensive experiments, NFT improves inference accuracy under device variations by up to 46.71% over state-of-the-art methods, while reducing epistemic uncertainty, boosting output confidence, and improving training convergence stability.
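The IRS idea described above can be illustrated with a minimal sketch: snapshot the intermediate representation of a clean forward pass, then penalize how far a noisy (device-variation) forward pass drifts from that snapshot, alongside the usual task loss. This is a toy two-layer network with multiplicative Gaussian weight noise; names like `forward`, `noisy`, and the weighting `lam` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, W1, W2):
    h = np.tanh(x @ W1)       # intermediate representation
    return h, h @ W2          # (hidden, output)

def noisy(W, sigma, rng):
    # Simple multiplicative-Gaussian model of NVM device variation (an assumption).
    return W * (1.0 + sigma * rng.standard_normal(W.shape))

W1 = rng.standard_normal((8, 16)) * 0.3
W2 = rng.standard_normal((16, 4)) * 0.3
x = rng.standard_normal(8)
y = np.zeros(4); y[1] = 1.0   # one-hot target

# Snapshot the clean intermediate representation (the "reference" signal).
h_clean, _ = forward(x, W1, W2)

# Noisy forward pass, as the model would behave on NVCiM hardware.
h_noisy, out = forward(x, noisy(W1, 0.1, rng), noisy(W2, 0.1, rng))

task_loss = np.mean((out - y) ** 2)
feedback = np.mean((h_noisy - h_clean) ** 2)   # negative-feedback term
lam = 0.5                                      # feedback strength (hyperparameter)
total_loss = task_loss + lam * feedback
```

Minimizing `total_loss` pushes the network toward weights whose intermediate activations stay stable under device noise, which is the feedback loop the paper's control-theory framing describes.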
πŸ“ Abstract
Compute-in-memory (CIM) accelerators built upon non-volatile memory (NVM) devices excel in energy efficiency and latency when performing Deep Neural Network (DNN) inference, thanks to their in-situ data processing capability. However, the stochastic nature and intrinsic variations of NVM devices often result in performance degradation in DNN inference. Introducing these non-ideal device behaviors during DNN training enhances robustness, but drawbacks include limited accuracy improvement, reduced prediction confidence, and convergence issues. This arises from a mismatch between the deterministic training and non-deterministic device variations, as such training, though considering variations, relies solely on the model's final output. In this work, we draw inspiration from control theory and propose a novel training concept: Negative Feedback Training (NFT), leveraging the multi-scale noisy information captured from the network. We develop two specific NFT instances, Oriented Variational Forward (OVF) and Intermediate Representation Snapshot (IRS). Extensive experiments show that our methods outperform existing state-of-the-art methods with up to a 46.71% improvement in inference accuracy while reducing epistemic uncertainty, boosting output confidence, and improving convergence probability. Their effectiveness highlights the generality and practicality of our NFT concept in enhancing DNN robustness against device variations.
Problem

Research questions and friction points this paper is trying to address.

Improving robustness of compute-in-memory DNN accelerators against device variations
Addressing performance degradation caused by the stochastic behavior of non-volatile memory devices
Resolving the mismatch between deterministic training and non-deterministic device variations
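The core friction point is that NVM conductances deviate from their programmed values on every read, so the same programmed weights yield a different output each time. A minimal sketch of this device-variation model (multiplicative Gaussian noise on a weight matrix is a common assumption, not the paper's specific device model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Ideal weight matrix that would be programmed into the NVM crossbar.
W = rng.standard_normal((4, 3))
x = rng.standard_normal(4)

def apply_device_variation(W, sigma, rng):
    """Multiplicative Gaussian noise: each conductance deviates
    independently from its programmed value on every read."""
    return W * (1.0 + sigma * rng.standard_normal(W.shape))

# The same input produces different outputs on different noise draws.
outputs = [x @ apply_device_variation(W, sigma=0.1, rng=rng) for _ in range(1000)]
spread = np.std(outputs, axis=0)   # non-zero: inference is no longer deterministic
```

A training loop that only ever sees the deterministic `W` never observes this spread, which is the mismatch NFT targets.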
Innovation

Methods, ideas, or system contributions that make the work stand out.

Negative Feedback Training (NFT) inspired by control theory
Two NFT instances, Oriented Variational Forward (OVF) and Intermediate Representation Snapshot (IRS), capture multi-scale noisy information from the network
Improves DNN robustness against non-volatile memory device variations
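Complementing the IRS idea, an OVF-style step can be sketched as running several variational (noisy) forward passes per training step and feeding the resulting output spread back as a penalty. The variance-penalty formulation here is an illustrative assumption about what such a loss could look like, not the paper's exact objective.

```python
import numpy as np

rng = np.random.default_rng(1)

W = rng.standard_normal((8, 4)) * 0.3
x = rng.standard_normal(8)
y = np.zeros(4); y[0] = 1.0   # one-hot target

def noisy_forward(x, W, sigma, rng):
    # One variational forward pass under a sampled device variation.
    return x @ (W * (1.0 + sigma * rng.standard_normal(W.shape)))

# Several noisy forward passes per training step.
outs = np.stack([noisy_forward(x, W, 0.1, rng) for _ in range(8)])

mean_out = outs.mean(axis=0)
variance_penalty = outs.var(axis=0).mean()   # feedback: penalize output spread
loss = np.mean((mean_out - y) ** 2) + 0.5 * variance_penalty
```

Driving `variance_penalty` toward zero during training rewards weights whose outputs are insensitive to device noise, the same robustness goal the accuracy and uncertainty results above report.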