QP-SNN: Quantized and Pruned Spiking Neural Networks

📅 2025-02-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenges of large model size, high resource overhead, and limited energy efficiency when deploying Spiking Neural Networks (SNNs) on edge devices, this paper proposes a lightweight, hardware-friendly co-optimization framework for SNNs. Methodologically, it introduces a weight rescaling strategy that makes fuller use of the available bit width to enhance representational capacity under low-bit quantization, and a structured pruning criterion based on the singular values of spatiotemporal spike activity that more accurately identifies and removes redundant convolutional kernels. The framework preserves the event-driven nature of SNNs while significantly reducing memory footprint and computational cost. Experiments across multiple benchmark datasets demonstrate state-of-the-art accuracy–efficiency trade-offs, underscoring the approach's potential for high-performance SNN deployment under severe resource constraints.
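The summary describes low-bit uniform quantization with a weight rescaling step. The paper's exact rescaling formula is not given here, so the following is a minimal sketch of the general idea under an assumed symmetric uniform scheme: weights are rescaled so that the largest magnitude maps onto the full signed integer range, avoiding wasted quantization levels. The function name and the choice of per-tensor (rather than per-channel) scaling are illustrative assumptions.

```python
import numpy as np

def rescale_and_quantize(w, bits=4):
    """Illustrative sketch (not the paper's exact method): rescale a
    weight tensor so it spans the full signed range of the target bit
    width, then apply uniform symmetric rounding."""
    qmax = 2 ** (bits - 1) - 1                 # e.g. 7 for 4-bit signed
    scale = np.max(np.abs(w)) / qmax           # map the largest weight to qmax
    w_q = np.clip(np.round(w / scale), -qmax, qmax)
    return w_q.astype(np.int8), scale

# Example: quantize a random conv-kernel tensor to 4 bits.
w = np.random.randn(16, 3, 3, 3).astype(np.float32)
w_q, s = rescale_and_quantize(w, bits=4)
# Dequantized weights approximate the originals: w ≈ w_q * s
```

Without the rescaling step, a tensor whose weights cluster well inside the representable range would occupy only a few of the available integer levels; mapping the extremes to the full range is what "utilizes bit width more effectively" refers to.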

📝 Abstract
Brain-inspired Spiking Neural Networks (SNNs) leverage sparse spikes to encode information and operate in an asynchronous, event-driven manner, offering a highly energy-efficient paradigm for machine intelligence. However, the current SNN community focuses primarily on performance improvement by developing large-scale models, which limits the applicability of SNNs in resource-limited edge devices. In this paper, we propose a hardware-friendly and lightweight SNN, aimed at effectively deploying high-performance SNNs in resource-limited scenarios. Specifically, we first develop a baseline model that integrates uniform quantization and structured pruning, called the QP-SNN baseline. While this baseline significantly reduces storage demands and computational costs, it suffers from performance decline. To address this, we conduct an in-depth analysis of the challenges in quantization and pruning that lead to performance degradation and propose solutions to enhance the baseline's performance. For weight quantization, we propose a weight rescaling strategy that utilizes bit width more effectively to enhance the model's representation capability. For structured pruning, we propose a novel pruning criterion using the singular values of spatiotemporal spike activities to enable more accurate removal of redundant kernels. Extensive experiments demonstrate that integrating the two proposed methods into the baseline allows QP-SNN to achieve state-of-the-art performance and efficiency, underscoring its potential for enhancing SNN deployment in edge intelligence computing.
Problem

Research questions and friction points this paper is trying to address.

Optimizing SNNs for edge devices
Reducing SNN storage and computational costs
Enhancing SNN performance with quantization and pruning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Quantized and Pruned SNN
Weight Rescaling Strategy
Singular Value Pruning Criterion
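The singular-value pruning criterion scores each convolutional kernel by the spatiotemporal spike activity it produces. The paper's exact formulation is not reproduced here; the sketch below assumes one plausible reading: for each output channel, flatten its binary spike maps over time steps into a (T × H·W) matrix and use its leading singular value as an importance score, pruning the lowest-scoring channels. The function name, the use of only the top singular value, and the 50% keep ratio are all illustrative assumptions.

```python
import numpy as np

def kernel_importance(spikes):
    """Illustrative sketch of an SVD-based channel score.
    spikes: (T, C, H, W) binary spike outputs of a conv layer, recorded
    over T time steps. Each channel's spatiotemporal activity is
    flattened to a (T, H*W) matrix; its largest singular value serves
    as the importance score (near-zero => redundant kernel)."""
    T, C, H, W = spikes.shape
    scores = np.empty(C)
    for c in range(C):
        mat = spikes[:, c].reshape(T, H * W)
        scores[c] = np.linalg.svd(mat, compute_uv=False)[0]
    return scores

# Example: score 8 channels from 4 time steps of sparse spikes,
# then keep the top half (an arbitrary ratio for illustration).
spikes = (np.random.rand(4, 8, 5, 5) > 0.7).astype(np.float32)
scores = kernel_importance(spikes)
keep = np.argsort(scores)[len(scores) // 2:]
```

The intuition is that a channel whose spike maps are nearly zero, or nearly identical across time and space, yields a low-rank, low-energy activity matrix with small singular values, marking its kernel as a candidate for structured removal.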