🤖 AI Summary
This work addresses the “dead neuron” problem in spiking neural networks (SNNs): when a fixed firing threshold is set poorly, a neuron may rarely or never spike, receives almost no gradient signal, and stops learning, which stalls training convergence. We present the first systematic formulation and implementation of end-to-end joint learning of neuronal thresholds and synaptic weights. Within a backpropagation-based gradient optimization framework, the threshold is promoted from a hyperparameter to a trainable parameter, so thresholds and weights are updated synchronously during training. The proposed method significantly enhances training robustness and accelerates convergence: on benchmark spatiotemporal datasets, including NMNIST, DVS128 Gesture, and Spiking Heidelberg Digits (SHD), it achieves up to a 30% reduction in training time and up to 2% higher accuracy, while substantially decreasing the number of epochs required to reach viable performance. This establishes a novel paradigm for efficient SNN training on neuromorphic computing platforms.
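As a minimal sketch of the mechanism, assume a discrete-time leaky integrate-and-fire neuron with a soft reset (our notation; the paper's exact neuron model may differ). The neuron $i$ spikes when its membrane potential $v_i[t]$ crosses its threshold $\theta_i$:

$$
v_i[t] = \beta\,\big(v_i[t-1] - \theta_i\, s_i[t-1]\big) + \sum_j w_{ij}\, x_j[t],
\qquad
s_i[t] = \Theta\big(v_i[t] - \theta_i\big),
$$

where $\Theta$ is the Heaviside step and $\beta$ is the leak factor. Treating each $\theta_i$ as a parameter and replacing $\Theta'$ with a smooth surrogate $\sigma'$ in the backward pass gives

$$
\frac{\partial \mathcal{L}}{\partial \theta_i}
\approx \sum_t \frac{\partial \mathcal{L}}{\partial s_i[t]}\,
\Big({-\,\sigma'\big(v_i[t] - \theta_i\big)}\Big),
$$

so the thresholds receive gradient updates in the same backward pass as the weights $w_{ij}$.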
📝 Abstract
Neuromorphic computing has recently gained momentum with the emergence of various neuromorphic processors. As the field advances, there is an increasing focus on developing training methods that can effectively leverage the unique properties of spiking neural networks (SNNs). SNNs emulate the temporal dynamics of biological neurons, making them particularly well suited for real-time, event-driven processing. To fully harness the potential of SNNs across different neuromorphic platforms, effective training methodologies are essential. In SNNs, learning rules are based on neurons' spiking behavior: a neuron emits a spike if and when its membrane potential exceeds its spiking threshold, and this spike timing encodes vital information. However, the threshold is generally treated as a hyperparameter, and an incorrect choice can leave neurons silent for large portions of the training process, hindering the effective rate of learning. This work focuses on the significance of learning neuron thresholds alongside weights in SNNs. Our results suggest that promoting the threshold from a hyperparameter to a trainable parameter effectively addresses the issue of dead neurons during training. This yields a more robust training algorithm with improved convergence, increased test accuracy, and a substantial reduction in the number of training epochs required to achieve viable accuracy on spatiotemporal datasets such as NMNIST, DVS128 Gesture, and Spiking Heidelberg Digits (SHD), with up to 30% training speed-up and up to 2% higher accuracy on these datasets.
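To make the idea concrete, here is a minimal PyTorch sketch of a LIF layer whose threshold is a trainable parameter. This is our own illustrative code, not the paper's implementation: the sigmoid surrogate gradient, its steepness, the soft reset, and the per-neuron `theta` initialization are all assumptions.

```python
import torch
import torch.nn as nn


class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike in the forward pass; sigmoid surrogate in the backward pass."""

    @staticmethod
    def forward(ctx, v_minus_theta):
        ctx.save_for_backward(v_minus_theta)
        return (v_minus_theta > 0).float()

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        sig = torch.sigmoid(4.0 * x)            # steepness 4.0 is an arbitrary choice
        return grad_out * 4.0 * sig * (1.0 - sig)


class LearnableThresholdLIF(nn.Module):
    """LIF layer whose firing threshold is a trainable parameter, updated
    in the same optimizer step as the synaptic weights."""

    def __init__(self, in_features, out_features, beta=0.9, theta_init=1.0):
        super().__init__()
        self.fc = nn.Linear(in_features, out_features)
        self.beta = beta                         # membrane leak factor
        # Threshold promoted from hyperparameter to per-neuron parameter:
        self.theta = nn.Parameter(torch.full((out_features,), theta_init))

    def forward(self, x_seq):
        # x_seq: (time, batch, in_features) spike inputs
        v = torch.zeros(x_seq.shape[1], self.fc.out_features, device=x_seq.device)
        spikes = []
        for x_t in x_seq:
            v = self.beta * v + self.fc(x_t)             # leaky integration
            s = SurrogateSpike.apply(v - self.theta)     # gradient reaches theta here
            v = v - s * self.theta                       # soft reset by the threshold
            spikes.append(s)
        return torch.stack(spikes)
```

Because `self.theta` is an `nn.Parameter`, a standard optimizer such as `torch.optim.Adam(model.parameters())` updates thresholds and weights together, which is the joint learning the abstract describes; no separate update rule or schedule is needed.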