🤖 AI Summary
To address the low energy efficiency of Artificial Neural Networks (ANNs) and the limited accuracy of Spiking Neural Networks (SNNs) on sparse edge workloads, this paper proposes a column-level, fine-grained ANN-SNN co-execution accelerator. The design rests on three contributions: (1) a novel integer-precision-preserving layer-to-column mapping technique that enables hybrid ANN/SNN deployment at column granularity; (2) an offline cost-model-driven scheduler that jointly optimizes dynamic spike generation, workload packing, and unified INT8 memory management to guarantee deterministic runtime behavior; and (3) hardware-software co-optimization that achieves competitive accuracy while substantially reducing Energy-Delay Product (EDP). Experimental results show that, compared to a pure-ANN baseline, the design reduces EDP by 57–67% and improves throughput by 16–19%; it further achieves a 2.5× speedup over LoAS and 2.51× energy savings versus SparTen.
📝 Abstract
NeuroFlex is a column-level accelerator that co-executes artificial and spiking neural networks to minimize energy-delay product on sparse edge workloads while maintaining competitive accuracy. The design extends integer-exact QCFS ANN-SNN conversion from whole layers to independent columns and unifies INT8 storage with on-the-fly spike generation. An offline cost model assigns columns to ANN or SNN cores and packs work across processing elements with deterministic runtime. The cost-guided scheduling algorithm improves throughput by 16–19% over random mapping and lowers EDP by 57–67% versus a strong ANN-only baseline across VGG-16, ResNet-34, GoogLeNet, and BERT. NeuroFlex also delivers up to 2.5× speedup over LoAS and 2.51× energy reduction over SparTen. These results indicate that fine-grained, integer-exact hybridization outperforms single-mode designs on energy and latency without sacrificing accuracy.
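To make the cost-guided assignment concrete, here is a minimal sketch of what an offline column-to-core scheduler of this kind might look like. The paper does not publish its cost model, so the `edp_ann`/`edp_snn` functions, the cost constants, and the `Column` fields below are all illustrative assumptions: the ANN core is modeled as paying for every MAC, while the event-driven SNN core pays only for non-zero activations plus a per-spike overhead, and each column greedily goes to the core with the lower modeled EDP.

```python
# Hypothetical sketch of cost-model-driven column scheduling in the spirit of
# NeuroFlex. All cost constants and the Column fields are illustrative
# assumptions, not the paper's actual model.

from dataclasses import dataclass


@dataclass
class Column:
    macs: int        # multiply-accumulate operations in this column
    sparsity: float  # fraction of zero activations, in [0, 1]


def edp_ann(col: Column) -> float:
    # ANN core: dense compute, so energy and delay scale with all MACs.
    energy = 1.0 * col.macs
    delay = 1.0 * col.macs
    return energy * delay


def edp_snn(col: Column) -> float:
    # SNN core: event-driven, so only non-zero (spiking) activations cost
    # work, but each event carries overhead (illustrative factor 1.8).
    active = col.macs * (1.0 - col.sparsity)
    energy = 1.8 * active
    delay = 1.8 * active
    return energy * delay


def schedule(columns: list[Column]) -> list[str]:
    """Greedy offline assignment: each column to the core with lower modeled EDP."""
    return ["SNN" if edp_snn(c) < edp_ann(c) else "ANN" for c in columns]


cols = [
    Column(macs=1000, sparsity=0.9),  # highly sparse -> SNN core wins
    Column(macs=1000, sparsity=0.1),  # mostly dense  -> ANN core wins
]
print(schedule(cols))  # → ['SNN', 'ANN']
```

Because the assignment is computed entirely offline from static cost estimates, the runtime behavior stays deterministic, which matches the abstract's claim; the real scheduler would additionally pack the assigned columns across processing elements.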