🤖 AI Summary
Quantized Spiking Neural Networks (QSNNs) suffer significant performance degradation on edge devices because low-bit quantization of the membrane potential discards historical membrane potential information. To address this, we propose a memory-free quantization scheme that implicitly preserves the full temporal evolution of membrane potentials without storing them directly. We further design a parallel training paradigm coupled with an asynchronous spike inference framework, enabling joint optimization of low-bit weights and activations for a favorable accuracy–efficiency trade-off. Our approach achieves state-of-the-art accuracy across multiple static and neuromorphic image benchmarks, reduces memory footprint by 42%, accelerates training by 3.1×, and substantially improves energy efficiency. This work constitutes the first systematic solution to quantization-induced historical information decay in QSNNs, establishing a new paradigm for low-power neuromorphic computing at the edge.
📝 Abstract
Quantized Spiking Neural Networks (QSNNs) offer superior energy efficiency and are well-suited for deployment on resource-limited edge devices. However, limited bit-width weights and membrane potentials result in a notable performance decline. In this study, we first identify a new underlying cause for this decline: the loss of historical information due to the quantized membrane potential. To tackle this issue, we introduce a memory-free quantization method that captures all historical information without directly storing membrane potentials, resulting in better performance with lower memory requirements. To further improve computational efficiency, we propose a parallel training and asynchronous inference framework that greatly increases training speed and energy efficiency. We combine the proposed memory-free quantization and parallel computation methods to develop a high-performance and efficient QSNN, named MFP-QSNN. Extensive experiments show that MFP-QSNN achieves state-of-the-art performance on various static and neuromorphic image datasets while requiring less memory and training faster. The efficiency and efficacy of MFP-QSNN highlight its potential for energy-efficient neuromorphic computing.
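To make the identified failure mode concrete, the sketch below simulates a leaky integrate-and-fire (LIF) neuron whose stored membrane potential is uniformly quantized after each timestep. This is an illustrative toy, not the paper's MFP-QSNN method: the functions `quantize` and `lif_forward` and all parameter choices are hypothetical. With aggressive (1-bit) quantization, sub-threshold residuals are rounded away, so the accumulated history is lost and the neuron never fires on an input train that does drive spikes at full precision.

```python
import numpy as np

def quantize(u, bits=2, u_max=1.0):
    # Uniform quantizer: snap u to one of 2**bits - 1 + 1 levels in [0, u_max].
    levels = 2 ** bits - 1
    return np.round(np.clip(u, 0.0, u_max) * levels / u_max) * u_max / levels

def lif_forward(inputs, tau=0.5, v_th=1.0, bits=None):
    # LIF dynamics over len(inputs) timesteps.
    # If `bits` is set, the *stored* membrane potential is quantized each
    # step, discarding the sub-threshold residual (the historical
    # information loss discussed in the abstract).
    u = 0.0
    spikes = []
    for x in inputs:
        u = tau * u + x            # leaky integration of input current
        s = float(u >= v_th)       # emit a spike on threshold crossing
        u = u - s * v_th           # soft reset by subtraction
        if bits is not None:
            u = quantize(u, bits)  # low-bit storage loses residual history
        spikes.append(s)
    return spikes

# Full precision: weak inputs accumulate across time and eventually spike.
full = lif_forward([0.3] * 6, tau=1.0)          # spike at timestep 4
# 1-bit potential: each 0.3 residual rounds to 0, so nothing accumulates.
lossy = lif_forward([0.3] * 6, tau=1.0, bits=1)  # never spikes
```

The memory-free scheme described in the abstract avoids this by not storing (and hence not quantizing) the running potential at all; the toy above only shows why naive low-bit storage degrades accuracy.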