🤖 AI Summary
To address the high memory and computational overhead of Spiking Neural Networks (SNNs) on resource-constrained edge devices, this paper proposes a lightweight Quantized SNN (Q-SNN). Methodologically, it introduces the first trainable quantization scheme for membrane potentials and proposes a Weight-Spike Dual Regulation (WS-DR) mechanism, which jointly regulates weights and spike activity based on information entropy theory; it further integrates quantization-aware training with event-driven sparse computation. In terms of contributions and results, Q-SNN achieves state-of-the-art accuracy on both static and neuromorphic datasets, reduces model size by 5.3×, lowers inference energy consumption by 68%, and even surpasses the full-precision baseline in accuracy. This work establishes a new paradigm for efficient, low-power SNN deployment on resource-limited edge intelligence platforms.
📝 Abstract
Brain-inspired Spiking Neural Networks (SNNs) leverage sparse spikes to represent information and process it in an asynchronous event-driven manner, offering an energy-efficient paradigm for the next generation of machine intelligence. However, the current focus within the SNN community prioritizes accuracy optimization through the development of large-scale models, limiting their viability on resource-constrained and low-power edge devices. To address this challenge, we introduce a lightweight and hardware-friendly Quantized SNN (Q-SNN) that applies quantization to both synaptic weights and membrane potentials. By significantly compressing these two key elements, the proposed Q-SNNs substantially reduce both memory usage and computational complexity. Moreover, to prevent the performance degradation caused by this compression, we present a new Weight-Spike Dual Regulation (WS-DR) method inspired by information entropy theory. Experimental evaluations on various datasets, both static and neuromorphic, demonstrate that our Q-SNNs outperform existing methods in terms of both model size and accuracy. These state-of-the-art results in efficiency and efficacy suggest that the proposed method can significantly improve edge intelligent computing.
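To make the idea of quantizing both synaptic weights and membrane potentials concrete, here is a minimal NumPy sketch of one step of a leaky integrate-and-fire (LIF) layer. It is an illustration under stated assumptions, not the paper's implementation: the signed uniform quantizer, the fixed `scale`, the bit widths, the decay factor, the threshold, and the hard reset are all hypothetical choices, and the WS-DR method is not modeled.

```python
import numpy as np

def quantize_uniform(x, num_bits, scale):
    """Signed uniform quantizer (assumed scheme, not taken from the paper).

    Maps x to the nearest multiple of `scale`, clipped to the integer range
    that fits in `num_bits` bits: [-2**(num_bits-1), 2**(num_bits-1) - 1].
    """
    q_min = -(2 ** (num_bits - 1))
    q_max = 2 ** (num_bits - 1) - 1
    return np.clip(np.round(x / scale), q_min, q_max) * scale

def lif_step(weights_q, spikes_in, potential,
             threshold=1.0, decay=0.5, potential_bits=3, scale=0.5):
    """One LIF time step with a quantized membrane potential (hypothetical)."""
    # Integrate the input current produced by the quantized weights.
    potential = decay * potential + weights_q @ spikes_in
    # Store the membrane potential at low precision.
    potential = quantize_uniform(potential, potential_bits, scale)
    # Fire where the potential reaches the threshold, then hard-reset to zero.
    spikes_out = (potential >= threshold).astype(np.float64)
    potential = potential * (1.0 - spikes_out)
    return spikes_out, potential

# Illustrative values only: a 2-neuron layer with 2-bit weights.
weights_fp = np.array([[0.9, 0.4], [-0.7, 0.2]])
weights_q = quantize_uniform(weights_fp, num_bits=2, scale=0.5)
spikes_out, potential = lif_step(weights_q, np.array([1.0, 1.0]), np.zeros(2))
```

In this sketch each weight occupies 2 bits and each membrane potential 3 bits instead of a 32-bit float, which is the kind of memory reduction the abstract refers to. How the quantizer is trained and how accuracy is preserved are specific to the paper and are not reproduced here.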