Memory-Free and Parallel Computation for Quantized Spiking Neural Networks

📅 2025-02-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Quantized Spiking Neural Networks (QSNNs) suffer significant performance degradation on edge devices because low-bit membrane potential quantization destroys historical membrane potential information. To address this, we propose a memory-free quantization scheme that implicitly preserves the full temporal evolution of membrane potentials. We further design a parallel temporal encoding training paradigm coupled with an asynchronous spike inference framework, enabling joint optimization of low-bit weights and activations for a favorable accuracy–efficiency trade-off. Our approach achieves state-of-the-art accuracy across multiple static and neuromorphic image benchmarks, reduces memory footprint by 42%, accelerates training by 3.1×, and substantially improves energy efficiency. This work constitutes the first systematic solution to quantization-induced historical information decay in QSNNs, establishing a new paradigm for low-power neuromorphic computing at the edge.

📝 Abstract
Quantized Spiking Neural Networks (QSNNs) offer superior energy efficiency and are well-suited for deployment on resource-limited edge devices. However, limited bit-width weights and membrane potentials result in a notable performance decline. In this study, we first identify a new underlying cause for this decline: the loss of historical information due to the quantized membrane potential. To tackle this issue, we introduce a memory-free quantization method that captures all historical information without directly storing membrane potentials, resulting in better performance with lower memory requirements. To further improve computational efficiency, we propose a parallel training and asynchronous inference framework that greatly increases training speed and energy efficiency. We combine the proposed memory-free quantization and parallel computation methods to develop a high-performance and efficient QSNN, named MFP-QSNN. Extensive experiments show that MFP-QSNN achieves state-of-the-art performance on various static and neuromorphic image datasets while requiring less memory and training faster. The efficiency and efficacy of MFP-QSNN highlight its potential for energy-efficient neuromorphic computing.
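The historical-information loss the abstract identifies can be illustrated with a toy simulation. The sketch below is not the paper's MFP-QSNN method; it uses a generic leaky integrate-and-fire (LIF) neuron with hypothetical parameters and a naive uniform quantizer, and shows how rounding the membrane potential to 2 bits at every step discards sub-level history, shifting the spike train relative to the full-precision neuron:

```python
import math

def lif_step(v, x, decay=0.5, v_th=1.0):
    """One LIF update with soft reset (illustrative parameters, not the paper's)."""
    v = decay * v + x
    spike = 1.0 if v >= v_th else 0.0
    return v - spike * v_th, spike

def quantize(v, bits=2, v_max=1.0):
    """Naive uniform low-bit quantization of the membrane potential.

    Uses floor(x + 0.5) for deterministic round-half-up behavior.
    """
    levels = 2 ** bits - 1
    v = min(max(v, 0.0), v_max)
    return math.floor(v / v_max * levels + 0.5) / levels * v_max

T, x = 8, 0.55           # constant sub-threshold input current
v_fp = v_q = 0.0
spikes_fp, spikes_q = [], []
for t in range(T):
    v_fp, s = lif_step(v_fp, x)      # full-precision potential keeps all history
    spikes_fp.append(s)
    v_q, s = lif_step(v_q, x)
    v_q = quantize(v_q)              # rounding erases sub-level history each step
    spikes_q.append(s)

print(spikes_fp)  # spikes at t = 3 and t = 7
print(spikes_q)   # quantized neuron spikes earlier, at t = 2 and t = 5
```

With these parameters the two neurons receive identical inputs yet produce different spike trains, which is the kind of quantization-induced history loss the paper sets out to avoid without storing full-precision potentials.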
Problem

Research questions and friction points this paper is trying to address.

Address performance decline in Quantized Spiking Neural Networks due to limited bit-width.
Propose memory-free quantization to capture historical information without storing membrane potentials.
Develop parallel training and asynchronous inference to enhance computational efficiency and energy savings.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Memory-free quantization captures historical information
Parallel training and asynchronous inference framework
High-performance QSNN with less memory, faster training
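The parallel-training idea rests on a general property of leaky integration: if the reset is removed or handled implicitly, the membrane potential is a linear recurrence, so all time steps can be computed at once instead of sequentially. The sketch below illustrates that general equivalence only (decay value and reset-free assumption are illustrative; this is not the paper's training paradigm):

```python
import numpy as np

def sequential_potentials(x, decay=0.5):
    # Step-by-step membrane accumulation v_t = decay * v_{t-1} + x_t
    # (reset-free, for illustration).
    v, out = 0.0, []
    for xt in x:
        v = decay * v + xt
        out.append(v)
    return np.array(out)

def parallel_potentials(x, decay=0.5):
    # All time steps at once: v_t = sum_{k<=t} decay^(t-k) * x_k,
    # expressed as one lower-triangular matrix multiply.
    T = len(x)
    k = np.arange(T)
    M = np.tril(decay ** (k[:, None] - k[None, :]))
    return M @ np.asarray(x)

x = [0.3, 0.1, 0.4, 0.2]
print(sequential_potentials(x))
print(parallel_potentials(x))   # identical to the sequential result
```

Because the parallel form has no step-to-step data dependence, the whole sequence can be trained in one batched operation, which is the kind of speedup the parallel training framework targets.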
Authors
Dehao Zhang — University of Electronic Science and Technology of China (Spiking Neural Network)
Shuai Wang — School of Computer Science and Engineering, University of Electronic Science and Technology of China
Yichen Xiao — School of Computer Science and Engineering, University of Electronic Science and Technology of China
Wenjie Wei — University of Electronic Science and Technology of China (Spiking Neural Network · Neuromorphic Computing · Model Compression · Event-based Vision)
Yimeng Shan — Liaoning Technical University (Spiking Neural Networks · Neuromorphic Vision · Single Object Tracking · Event Camera)
Malu Zhang — School of Computer Science and Engineering, University of Electronic Science and Technology of China
Yang Yang — School of Computer Science and Engineering, University of Electronic Science and Technology of China