🤖 AI Summary
Stochastic computing neural networks (SC-NNs) deployed on resource-constrained IoT devices must combine low energy consumption with high accuracy. To address this challenge, the paper introduces a mixed-precision paradigm into SC-NNs for the first time, proposing Adjustable Sequence Length (ASL), a framework that assigns sequence lengths layer by layer. Methodologically, it establishes a theoretical model of noise propagation based on operator norms, and it designs coarse- and fine-grained truncation strategies guided by a multilayer truncation sensitivity analysis that uses random forest regression. Evaluated on a pipelined SC-MLP synthesized at 32 nm, the approach reduces energy consumption and latency by more than 60% in the best case, with negligible accuracy degradation. Key contributions include: (1) the first systematic layer-wise mixed-precision methodology tailored for SC-NNs; (2) a noise propagation model that quantifies how truncation effects accumulate across layers; and (3) a significant improvement in energy efficiency for edge AI in IoT applications.
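The operator-norm idea can be illustrated with a small sketch. It is not the paper's model: it assumes a toy network of linear layers followed by ReLU, uses the induced infinity norm (maximum absolute row sum) as the operator norm, and treats truncation as additive noise of bounded size on each layer's output. All function names are hypothetical. Under those assumptions, noise injected at layer l is amplified by at most the product of the norms of the layers after it.

```python
def inf_norm(matrix):
    """Induced infinity norm of a matrix: maximum absolute row sum."""
    return max(sum(abs(v) for v in row) for row in matrix)


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def relu(vec):
    return [max(0.0, v) for v in vec]


def amplification_factors(weights):
    """factor[l] = product of the norms of all layers after layer l."""
    factors = []
    running = 1.0
    for w in reversed(weights):
        factors.append(running)
        running *= inf_norm(w)
    return list(reversed(factors))


def noise_bound(weights, eps):
    """Upper bound on output deviation given per-layer noise sizes eps[l]."""
    return sum(e * a for e, a in zip(eps, amplification_factors(weights)))


def forward(weights, x, noise=None):
    """Linear + ReLU layers; noise[l] (if given) is added to layer l's output."""
    for l, w in enumerate(weights):
        x = relu(matvec(w, x))
        if noise is not None:
            x = [xi + ni for xi, ni in zip(x, noise[l])]
    return x


if __name__ == "__main__":
    # Toy two-layer network (assumed values, for illustration only).
    weights = [
        [[0.5, -0.25], [0.25, 0.5]],
        [[1.0, 0.5], [-0.5, 0.25]],
    ]
    noise = [[0.1, -0.1], [0.2, 0.2]]            # stand-in for truncation noise
    eps = [max(abs(v) for v in n) for n in noise]  # infinity norm of each noise vector

    clean = forward(weights, [1.0, 0.5])
    noisy = forward(weights, [1.0, 0.5], noise)
    deviation = max(abs(a - b) for a, b in zip(clean, noisy))

    print("amplification factors:", amplification_factors(weights))
    print("observed deviation:", deviation)
    print("bound:", noise_bound(weights, eps))
```

Because ReLU is 1-Lipschitz, the observed deviation never exceeds the bound. In this toy setting, a layer with a large amplification factor is one where truncation is comparatively costly.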
📝 Abstract
Stochastic computing (SC) has emerged as an efficient low-power alternative for deploying neural networks (NNs) in resource-limited scenarios such as the Internet of Things (IoT). By encoding values as serial bitstreams, SC significantly reduces energy dissipation compared to conventional floating-point (FP) designs; however, layer-wise mixed-precision implementation for SC remains unexplored. This article introduces Adjustable Sequence Length (ASL), a novel scheme that applies mixed-precision concepts specifically to SC NNs. Using an operator-norm-based theoretical model, the article shows that truncation noise propagates cumulatively through the layers, scaled by estimated amplification factors. An extended sensitivity analysis uses random forest (RF) regression to evaluate multilayer truncation effects and to validate that the theoretical predictions align with practical network behavior. To accommodate different application scenarios, the article proposes two truncation strategies, coarse-grained and fine-grained, which apply different sequence length configurations at each layer. Evaluations on a pipelined SC MLP synthesized at 32 nm demonstrate that ASL can reduce energy and latency overheads by more than 60% in the best case with negligible accuracy loss. These results confirm the feasibility of the ASL scheme for IoT applications and highlight the distinct advantages of mixed-precision truncation in SC designs.
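To make the link between sequence length and precision concrete, here is a minimal sketch of SC arithmetic with truncated bitstreams. It assumes the standard unipolar encoding (each bit is 1 with probability equal to the encoded value) and AND-gate multiplication, and it models truncation as keeping only the first bits of a stream. The abstract does not specify these details, so the encoding, the truncation rule, and every name below are illustrative assumptions rather than the authors' design.

```python
import random


def encode(value, length, rng):
    """Unipolar SC encoding: each bit is 1 with probability `value` (0 <= value <= 1)."""
    return [1 if rng.random() < value else 0 for _ in range(length)]


def decode(bits):
    """Estimate the encoded value as the fraction of 1 bits."""
    return sum(bits) / len(bits)


def sc_multiply(a_bits, b_bits):
    """Multiply two independent unipolar streams with a bitwise AND."""
    return [a & b for a, b in zip(a_bits, b_bits)]


def truncate(bits, length):
    """Assumed truncation rule: keep only the first `length` bits."""
    return bits[:length]


def mean_abs_error(x, y, length, full_length=1024, trials=200, seed=0):
    """Average |SC product - exact product| when streams are cut to `length` bits."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        a = truncate(encode(x, full_length, rng), length)
        b = truncate(encode(y, full_length, rng), length)
        total += abs(decode(sc_multiply(a, b)) - x * y)
    return total / trials


if __name__ == "__main__":
    # Shorter streams cost fewer cycles but give a noisier product estimate.
    for length in (1024, 256, 64, 16):
        print(length, mean_abs_error(0.5, 0.75, length))
```

In this sketch the processing cost of a stream scales with its length while the estimation error grows as the stream shrinks, which is the trade-off a layer-wise choice of sequence length has to balance.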