🤖 AI Summary
Quantized Spiking Neural Networks (QSNNs) suffer significant performance degradation on edge devices because low-bit quantization of the membrane potential discards historical membrane potential information. To address this, we propose a memory-free quantization scheme that implicitly preserves the full temporal evolution of membrane potentials without storing them directly. We further design a parallel training paradigm coupled with an asynchronous spike inference framework, enabling joint optimization of low-bit weights and activations for a favorable accuracy–efficiency trade-off. Our approach achieves state-of-the-art accuracy across multiple static and neuromorphic image benchmarks, reduces memory footprint by 42%, accelerates training by 3.1×, and substantially improves energy efficiency. This work constitutes the first systematic solution to quantization-induced historical information decay in QSNNs, establishing a new paradigm for low-power neuromorphic computing at the edge.
📝 Abstract
Quantized Spiking Neural Networks (QSNNs) offer superior energy efficiency and are well-suited for deployment on resource-limited edge devices. However, limited bit-width weights and membrane potentials result in a notable performance decline. In this study, we first identify a new underlying cause for this decline: the loss of historical information due to the quantized membrane potential. To tackle this issue, we introduce a memory-free quantization method that captures all historical information without directly storing membrane potentials, resulting in better performance with lower memory requirements. To further improve computational efficiency, we propose a parallel training and asynchronous inference framework that greatly increases training speed and energy efficiency. We combine the proposed memory-free quantization and parallel computation methods to develop a high-performance and efficient QSNN, named MFP-QSNN. Extensive experiments show that MFP-QSNN achieves state-of-the-art performance on various static and neuromorphic image datasets while requiring less memory and training faster. The efficiency and efficacy of MFP-QSNN highlight its potential for energy-efficient neuromorphic computing.
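To make the identified failure mode concrete, the sketch below simulates a leaky integrate-and-fire (LIF) neuron whose stored membrane potential is uniformly quantized after each timestep. This is an illustrative toy, not the paper's MFP-QSNN method: the functions `quantize` and `lif_forward` and all parameter choices are hypothetical. With aggressive (1-bit) quantization, sub-threshold residuals are rounded away, so the accumulated history is lost and the neuron never fires on an input train that does drive spikes at full precision.

```python
import numpy as np

def quantize(u, bits=2, u_max=1.0):
    # Uniform quantizer: snap u to one of 2**bits - 1 + 1 levels in [0, u_max].
    levels = 2 ** bits - 1
    return np.round(np.clip(u, 0.0, u_max) * levels / u_max) * u_max / levels

def lif_forward(inputs, tau=0.5, v_th=1.0, bits=None):
    # LIF dynamics over len(inputs) timesteps.
    # If `bits` is set, the *stored* membrane potential is quantized each
    # step, discarding the sub-threshold residual (the historical
    # information loss discussed in the abstract).
    u = 0.0
    spikes = []
    for x in inputs:
        u = tau * u + x            # leaky integration of input current
        s = float(u >= v_th)       # emit a spike on threshold crossing
        u = u - s * v_th           # soft reset by subtraction
        if bits is not None:
            u = quantize(u, bits)  # low-bit storage loses residual history
        spikes.append(s)
    return spikes

# Full precision: weak inputs accumulate across time and eventually spike.
full = lif_forward([0.3] * 6, tau=1.0)          # spike at timestep 4
# 1-bit potential: each 0.3 residual rounds to 0, so nothing accumulates.
lossy = lif_forward([0.3] * 6, tau=1.0, bits=1)  # never spikes
```

The memory-free scheme described in the abstract avoids this by not storing (and hence not quantizing) the running potential at all; the toy above only shows why naive low-bit storage degrades accuracy.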