🤖 AI Summary
To address the significant accuracy degradation in Spiking Neural Networks (SNNs) caused by ultra-low-bit weight quantization, this paper proposes Temporal-adaptive Weight Quantization (TaWQ). Inspired by astrocyte-mediated regulation of synaptic plasticity in biological neural systems, TaWQ is the first method to incorporate temporal dynamics into weight quantization—adaptively allocating ultra-low-bit weights along the time dimension and jointly optimizing spiking dynamics and quantization error during training. The approach enables end-to-end quantized training on both static (ImageNet) and neuromorphic datasets. Experiments demonstrate that, on ImageNet, TaWQ incurs only a 0.22% Top-1 accuracy drop while compressing model parameters to 4.12M and reducing single-inference energy consumption to 0.63 mJ. This achieves an unprecedented balance between high accuracy and extreme energy efficiency, establishing a novel paradigm for brain-inspired computing at the edge.
📝 Abstract
Weight quantization in spiking neural networks (SNNs) can further reduce energy consumption. However, quantizing weights without sacrificing accuracy remains challenging. In this study, inspired by astrocyte-mediated synaptic modulation in biological nervous systems, we propose Temporal-adaptive Weight Quantization (TaWQ), which integrates temporal dynamics into weight quantization to adaptively allocate ultra-low-bit weights along the temporal dimension. Extensive experiments on static (e.g., ImageNet) and neuromorphic (e.g., CIFAR10-DVS) datasets demonstrate that TaWQ maintains high energy efficiency (4.12M parameters, 0.63 mJ per inference) while incurring a negligible quantization loss of only 0.22% on ImageNet.
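To make the core idea concrete, here is a minimal sketch of per-timestep ultra-low-bit weight quantization in a spiking neuron. All names (`quantize_weights`, `lif_step`) and the specific bit-allocation schedule are illustrative assumptions, not the paper's actual algorithm or API; the point is only to show weights being re-quantized with a different bit-width at each timestep of a leaky integrate-and-fire (LIF) forward pass.

```python
# Illustrative sketch only: temporal-adaptive low-bit weight quantization
# for a single LIF neuron. Function names and the bit schedule are
# assumptions, not TaWQ's actual method.

def quantize_weights(weights, bits):
    """Uniform symmetric quantization of a weight vector to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1                 # e.g. bits=2 -> levels {-1, 0, +1}
    scale = max(abs(w) for w in weights) / qmax or 1.0
    return [round(w / scale) * scale for w in weights]

def lif_step(v, inputs, weights, threshold=1.0, decay=0.5):
    """One leaky integrate-and-fire update; returns (new potential, spike)."""
    v = decay * v + sum(w * x for w, x in zip(weights, inputs))
    if v >= threshold:
        return 0.0, 1                          # fire and reset
    return v, 0

# Toy forward pass: a (hypothetical) per-timestep bit allocation lets some
# timesteps use slightly more precision than others.
weights = [0.8, -0.3, 0.5]
bits_per_step = [2, 2, 3, 2]                   # temporal allocation (illustrative)
spike_inputs = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [1, 0, 0]]

v, spikes = 0.0, []
for bits, x in zip(bits_per_step, spike_inputs):
    qw = quantize_weights(weights, bits)       # weights re-quantized each timestep
    v, s = lif_step(v, x, qw)
    spikes.append(s)
print(spikes)  # → [1, 0, 0, 1]
```

In a trained model the bit allocation would be learned jointly with the spiking dynamics rather than fixed by hand as above; this sketch only shows where the temporal dimension enters the quantization.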