🤖 AI Summary
To address gradient instability and optimization limitations in spiking neural networks (SNNs) arising from the mismatch between fixed surrogate gradients (SGs) and evolving membrane potential dynamics (MPD), this paper proposes an MPD-adaptive surrogate gradient method. It establishes a dynamic coupling mechanism between surrogate gradients and multi-timestep membrane potential evolution, aligning gradient estimation with the MPD in real time through temporal-aware calibration of the gradient-available interval and a tunable degree of freedom. This approach overcomes the generalization bottleneck of fixed SGs while preserving the direct-training paradigm of SNNs. It significantly improves classification accuracy in low-latency regimes, substantially increases the proportion of effectively activated neuronal gradients, and achieves state-of-the-art performance across multiple benchmark datasets.
📝 Abstract
Brain-inspired spiking neural networks (SNNs) are recognized as a promising avenue for efficient, low-energy neuromorphic computing. Recent advances have focused on directly training high-performance SNNs by estimating approximate gradients of spiking activity through a continuous function with constant sharpness, known as surrogate gradient (SG) learning. However, as spikes propagate among neurons, the distribution of membrane potential dynamics (MPD) deviates from the gradient-available interval of a fixed SG, hindering SNNs from exploring the optimal solution space. To maintain stable gradient flow, the SG needs to stay aligned with the evolving MPD. Here, we propose adaptive gradient learning for SNNs that exploits MPD, namely MPD-AGL. It fully accounts for the underlying factors contributing to membrane potential shifts and establishes a dynamic association between the SG and MPD at different timesteps to relax gradient estimation, providing a new degree of freedom for SG learning. Experimental results demonstrate that our method achieves excellent performance at low latency. Moreover, it increases the proportion of neurons that fall into the gradient-available interval compared to a fixed SG, effectively mitigating the gradient vanishing problem.
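The core idea described above, calibrating the SG's gradient-available interval to the membrane-potential distribution at each timestep rather than using a constant sharpness, can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the function name `adaptive_surrogate_grad`, the triangular surrogate shape, the use of the per-timestep standard deviation as the spread statistic, and the scale parameter `k` are all assumptions made here for clarity.

```python
import numpy as np

def adaptive_surrogate_grad(v, v_th=1.0, k=1.0, eps=1e-6):
    """Sketch of an MPD-adaptive surrogate gradient (triangular shape).

    Instead of a fixed sharpness, the width of the gradient-available
    interval is calibrated from the empirical spread of the membrane
    potentials `v` at the current timestep, so the SG tracks where the
    potentials actually lie.
    """
    # Calibrate the interval width from the membrane potential distribution
    # (assumption: standard deviation as the spread statistic; k is tunable).
    width = k * (np.std(v) + eps)
    # Triangular surrogate: peaks at the firing threshold v_th and is zero
    # outside the calibrated interval [v_th - width, v_th + width].
    return np.maximum(0.0, 1.0 - np.abs(v - v_th) / width) / width

# Usage: a fixed narrow SG (width 0.1) leaves most neurons with zero
# gradient once potentials drift, while the adaptive version keeps a
# larger fraction of neurons inside the gradient-available interval.
v = np.array([0.2, 0.8, 1.0, 1.4, 2.0])
g_adaptive = adaptive_surrogate_grad(v)
g_fixed = np.maximum(0.0, 1.0 - np.abs(v - 1.0) / 0.1) / 0.1
```

In a direct-training pipeline this function would replace the derivative of the Heaviside spike function in the backward pass; the forward spike generation is unchanged, which is why the direct-training paradigm of SNNs is preserved.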