Temporal Separation with Entropy Regularization for Knowledge Distillation in Spiking Neural Networks

📅 2025-03-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large performance gaps persist between spiking neural networks (SNNs) and artificial neural networks (ANNs), and existing knowledge distillation (KD) methods do little to close them because they neglect the intrinsic spatiotemporal dynamics of SNNs. To address this, we propose a temporally decoupled logit-level KD framework. Our method explicitly decomposes the distillation process across timesteps at the logit level and introduces class-probability entropy regularization to stabilize optimization and improve the robustness of temporal representations. Unlike conventional approaches that aggregate outputs over time, our framework performs fine-grained timestep-wise logit alignment, exploiting the temporal expressive capacity of SNNs. Evaluated on multiple benchmark datasets, the proposed method consistently outperforms state-of-the-art logit-, feature-, and hybrid-based KD approaches, improving classification accuracy while preserving the SNN's inherent energy efficiency.
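As a rough illustration of the temporal separation idea, the PyTorch-style sketch below applies the distillation loss to the student's logits at each timestep rather than to their time-averaged sum. The tensor shapes, temperature, and function names are assumptions made for illustration and are not taken from the paper or its released code.

```python
import torch
import torch.nn.functional as F

def temporal_logit_kd_loss(snn_logits, teacher_logits, tau=4.0):
    """Distill at every timestep instead of on the time-averaged output.

    snn_logits:     [T, batch, num_classes]  student (SNN) logits per timestep
    teacher_logits: [batch, num_classes]     single ANN teacher output
    """
    teacher_prob = F.softmax(teacher_logits / tau, dim=-1)
    loss = 0.0
    num_steps = snn_logits.shape[0]
    for t in range(num_steps):
        student_logprob = F.log_softmax(snn_logits[t] / tau, dim=-1)
        # KL(teacher || student) at this timestep, scaled by tau^2 as is common in KD
        loss = loss + F.kl_div(student_logprob, teacher_prob,
                               reduction="batchmean") * (tau ** 2)
    return loss / num_steps
```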

📝 Abstract
Spiking Neural Networks (SNNs), inspired by the human brain, offer significant computational efficiency through discrete spike-based information transfer. Despite their potential to reduce inference energy consumption, a performance gap persists between SNNs and Artificial Neural Networks (ANNs), primarily due to current training methods and inherent model limitations. While recent research has aimed to enhance SNN learning by employing knowledge distillation (KD) from ANN teacher networks, traditional distillation techniques often overlook the distinctive spatiotemporal properties of SNNs and thus fail to fully leverage their advantages. To overcome these challenges, we propose a novel logit distillation method characterized by temporal separation and entropy regularization. This approach improves on existing SNN distillation techniques by performing distillation on logits at each time step, rather than only on aggregated output features. Furthermore, the integration of entropy regularization stabilizes model optimization and further boosts performance. Extensive experimental results indicate that our method surpasses prior SNN distillation strategies, whether based on logit distillation, feature distillation, or a combination of both. The code will be available on GitHub.
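For concreteness, a minimal sketch of a class-probability entropy regularizer is given below, assuming it is computed on the per-timestep softmax outputs of the SNN and added to the training objective with a small weight; the exact form, sign, and weighting used in the paper may differ, and the names `alpha` and `beta` are illustrative hyperparameters.

```python
import torch
import torch.nn.functional as F

def class_entropy_regularizer(snn_logits):
    """Mean Shannon entropy of the per-timestep class distributions.

    snn_logits: [T, batch, num_classes]
    """
    prob = F.softmax(snn_logits, dim=-1)
    log_prob = F.log_softmax(snn_logits, dim=-1)
    entropy = -(prob * log_prob).sum(dim=-1)   # [T, batch]
    return entropy.mean()

# Hypothetical combined objective: cross-entropy + distillation + entropy term,
# with alpha and beta as illustrative weighting hyperparameters.
# total_loss = ce_loss + alpha * kd_loss + beta * class_entropy_regularizer(snn_logits)
```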
Problem

Research questions and friction points this paper is trying to address.

Addresses performance gap between SNNs and ANNs
Improves knowledge distillation for SNNs using temporal separation
Enhances SNN training with entropy regularization for stability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Temporal separation enhances logit distillation.
Entropy regularization stabilizes model optimization.
Distillation across time steps improves SNN performance.
👥 Authors
Kairong Yu
Zhejiang University
Computer Vision · Multimodal Learning · Spiking Neural Network
Chengting Yu
Zhejiang University
Tianqing Zhang
Zhejiang University
Xiaochen Zhao
Research Scientist, ByteDance
Computer Science
Shu Yang
Zhejiang University
Hongwei Wang
Zhejiang University
Qiang Zhang
Dalian University of Technology
Qi Xu
Dalian University of Technology