🤖 AI Summary
Training large-scale Spiking Neural Networks (SNNs) incurs prohibitive computational overhead, and existing data pruning methods—designed for Artificial Neural Networks (ANNs)—produce biased importance estimates and high gradient variance in SNNs because they neglect spiking dynamics.
Method: This work introduces data pruning into SNN training for the first time, proposing a spike-aware importance score that jointly incorporates the gradient norm and spiking activity. The score enables efficient, low-variance estimation of sample importance without computing per-example gradients directly, and it integrates tightly with SNN backpropagation for co-optimization.
Contribution/Results: On ImageNet, the method achieves a 35% training speedup with no loss in accuracy relative to full-data training, approaching the theoretical maximum speedup. Extensive generalization experiments confirm its effectiveness across diverse SNN architectures and datasets. This work establishes a novel paradigm for resource-efficient SNN training.
📝 Abstract
Spiking neural networks (SNNs), recognized as an energy-efficient alternative to traditional artificial neural networks (ANNs), have advanced rapidly through the scaling of models and datasets. However, such scaling incurs considerable training overhead, posing challenges for researchers with limited computational resources and hindering the sustained development of SNNs. Data pruning is a promising strategy for accelerating training by retaining the most informative examples and discarding redundant ones, but it remains largely unexplored in SNNs. Directly applying ANN-based data pruning methods to SNNs fails to capture the intrinsic importance of examples and suffers from high gradient variance. To address these challenges, we propose a novel spike-aware data pruning (SADP) method. SADP reduces gradient variance by setting each example's selection probability proportional to its gradient norm, while avoiding the high cost of direct gradient computation through an efficient upper bound, termed the spike-aware importance score. This score accounts for the influence of all-or-nothing spikes on the gradient norm and can be computed with negligible overhead. Extensive experiments across diverse datasets and architectures demonstrate that SADP consistently outperforms data pruning baselines and achieves training speedups close to the theoretical maxima at different pruning ratios. Notably, SADP reduces training time by 35% on ImageNet while maintaining accuracy comparable to that of full-data training. This work therefore establishes a data-centric paradigm for efficient SNN training and paves the way for scaling SNNs to larger models and datasets. The source code will be released publicly after the review process.
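The core selection scheme described above—sampling examples with probability proportional to an importance score and correcting the gradient estimate accordingly—can be sketched as follows. This is a minimal illustration, not the paper's implementation: the scores here are placeholder values standing in for the spike-aware importance score, and the `select_examples` helper is hypothetical.

```python
import numpy as np

def select_examples(scores, keep_ratio, rng):
    """Sample a subset with probability proportional to each example's
    importance score, and return inverse-probability weights so that the
    reweighted mini-batch gradient remains (approximately) unbiased.

    Note: sampling without replacement makes the 1/(n*p) weights only an
    approximation of the true inclusion probabilities; this is a sketch,
    not the paper's exact estimator.
    """
    n = len(scores)
    k = max(1, int(keep_ratio * n))
    probs = scores / scores.sum()                # selection probabilities
    idx = rng.choice(n, size=k, replace=False, p=probs)
    weights = 1.0 / (n * probs[idx])             # importance-sampling weights
    return idx, weights

rng = np.random.default_rng(0)
# Placeholder scores: in SADP these would be the cheap spike-aware upper
# bounds on per-example gradient norms, not random numbers.
scores = rng.uniform(0.1, 1.0, size=1000)
idx, w = select_examples(scores, keep_ratio=0.65, rng=rng)
print(len(idx))  # 650 examples retained, i.e. ~35% fewer backward passes
```

In training, the loss of each retained example would be multiplied by its weight before backpropagation, so that pruning lowers cost without biasing the expected gradient.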