Enhanced Self-Distillation Framework for Efficient Spiking Neural Network Training

📅 2025-10-04
🤖 AI Summary
To address the limitations of backpropagation through time (BPTT) and surrogate gradient methods for spiking neural network (SNN) training—including suboptimal accuracy, high temporal computational overhead, and excessive memory consumption—this paper proposes an enhanced self-distillation framework. Methodologically, it introduces: (1) a lightweight artificial neural network (ANN) branch that takes intermediate-layer spike rates of the SNN as input, enabling cross-modal knowledge transfer; (2) the first decomposition of teacher signals into reliable and unreliable components, where only the reliable component guides SNN optimization to improve convergence stability; and (3) the integration of rate-based backpropagation with self-distillation, eliminating temporal unrolling and gradient truncation. Evaluated on CIFAR-10/100, CIFAR10-DVS, and ImageNet, the method significantly reduces training complexity while surpassing state-of-the-art SNN training approaches in accuracy, validating the efficacy of this efficient co-optimization paradigm.
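The rate-based pathway described above can be illustrated with a minimal sketch: spikes are averaged over time into firing rates, which a lightweight linear head then maps to class logits without any temporal unrolling. The tensor layout and the branch's architecture are assumptions for illustration, not the paper's specification.

```python
import numpy as np

def firing_rate(spikes):
    """Average binary spikes over the leading time axis -> rates in [0, 1]."""
    return spikes.mean(axis=0)  # [T, B, C, H, W] -> [B, C, H, W]

def ann_branch(rates, W, b):
    """Hypothetical lightweight ANN head on an intermediate layer's spike rates.

    Global-average-pools the rate map, then applies a linear classifier;
    gradients flow through rates alone, so no backpropagation through time
    is needed for this pathway.
    """
    pooled = rates.mean(axis=(-2, -1))  # pool over H, W -> [B, C]
    return pooled @ W + b               # class logits, [B, num_classes]
```

In the full framework several such branches would be attached at different intermediate layers, each distilled against the (reliable part of the) model's own output.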

📝 Abstract
Spiking Neural Networks (SNNs) exhibit exceptional energy efficiency on neuromorphic hardware due to their sparse activation patterns. However, conventional training methods based on surrogate gradients and Backpropagation Through Time (BPTT) not only lag behind Artificial Neural Networks (ANNs) in performance, but also incur significant computational and memory overheads that grow linearly with the temporal dimension. To enable high-performance SNN training under limited computational resources, we propose an enhanced self-distillation framework, jointly optimized with rate-based backpropagation. Specifically, the firing rates of intermediate SNN layers are projected onto lightweight ANN branches, and high-quality knowledge generated by the model itself is used to optimize substructures through the ANN pathways. Unlike traditional self-distillation paradigms, we observe that low-quality self-generated knowledge may hinder convergence. To address this, we decouple the teacher signal into reliable and unreliable components, ensuring that only reliable knowledge is used to guide the optimization of the model. Extensive experiments on CIFAR-10, CIFAR-100, CIFAR10-DVS, and ImageNet demonstrate that our method reduces training complexity while achieving high-performance SNN training. Our code is available at https://github.com/Intelli-Chip-Lab/enhanced-self-distillation-framework-for-snn.
Problem

Research questions and friction points this paper is trying to address.

Improving SNN training efficiency under limited computational resources
Reducing computational and memory overheads in temporal SNN training
Preventing unreliable self-generated knowledge from hindering SNN convergence
Innovation

Methods, ideas, or system contributions that make the work stand out.

Enhanced self-distillation framework with rate-based backpropagation
Projects SNN firing rates onto lightweight ANN branches
Decouples teacher signals into reliable and unreliable components
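The decomposition in the last bullet can be sketched with one plausible criterion; the paper's exact rule is not given here, so the mask below (teacher argmax agreeing with the ground-truth label) is a hypothetical stand-in. Samples judged unreliable fall back to one-hot labels, so only reliable self-knowledge shapes the distillation target.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def reliable_teacher_targets(teacher_logits, labels):
    """Split per-sample teacher soft labels into reliable/unreliable parts.

    Hypothetical criterion: a teacher distribution counts as reliable when
    its argmax matches the ground-truth label; otherwise the distillation
    target reverts to the one-hot label.
    """
    probs = softmax(teacher_logits)                 # [B, K]
    reliable = probs.argmax(axis=-1) == labels      # [B] boolean mask
    onehot = np.eye(probs.shape[-1])[labels]        # [B, K]
    targets = np.where(reliable[:, None], probs, onehot)
    return targets, reliable
```

A distillation loss (e.g. cross-entropy of student logits against `targets`) would then only receive soft teacher knowledge on the reliable samples.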
Xiaochen Zhao
Research Scientist, ByteDance
Chengting Yu
Zhejiang University
Kairong Yu
Zhejiang University
Lei Liu
ZJU-UIUC Institute, Zhejiang University
Aili Wang
ZJU-UIUC Institute, Zhejiang University; College of Information Science and Electronic Engineering, Zhejiang University