🤖 AI Summary
Second-order nonlinear dynamics in multi-state spiking neurons (e.g., AdLIF) can cause instability during training and inference. Method: This work establishes, for the first time, a rigorous mathematical correspondence between state-space models (SSMs) and second-order spiking neurons, leading to two novel neuron designs: (1) a logarithmically reparameterized AdLIF variant that improves gradient flow and parameter interpretability; and (2) an oscillatory spiking neuron based on a complex-diagonal SSM structure that explicitly models periodic dynamics. The approach combines timestep training, event-driven computation, and end-to-end optimization on raw audio. Contribution/Results: The proposed neurons achieve performance near or beyond the state of the art on event-camera and speech recognition benchmarks while using fewer parameters, a lower memory footprint, and higher spike sparsity, improving both the scalability and the performance-efficiency trade-off of large-scale spiking neural networks.
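To make the logarithmic reparametrization concrete, below is a minimal NumPy sketch of an AdLIF-style forward pass. The exact coupling form, the soft reset, and all names (`adlif_forward`, `log_tau_u`, `log_tau_w`, `a`, `b`, `theta`) are illustrative assumptions rather than the paper's formulation; the key idea shown is that decays computed as `exp(-dt / exp(log_tau))` remain in (0, 1) for any real-valued parameter, which keeps the recurrence stable and the gradients well-behaved.

```python
import numpy as np

def adlif_forward(I, log_tau_u, log_tau_w, a, b, theta=1.0, dt=1.0):
    """Run a single AdLIF-style neuron over an input current sequence I.

    Trainable time constants live in log-space: alpha, beta are always
    in (0, 1), so the two coupled states cannot diverge from a bad
    parameter draw (hypothetical parameterization for illustration).
    """
    alpha = np.exp(-dt / np.exp(log_tau_u))  # membrane decay in (0, 1)
    beta = np.exp(-dt / np.exp(log_tau_w))   # adaptation decay in (0, 1)
    u, w, s = 0.0, 0.0, 0.0
    spikes = []
    for x in I:
        # Second-order subthreshold dynamics: membrane u coupled to an
        # adaptation variable w, with a soft reset of size theta on a spike.
        u = alpha * u + (1 - alpha) * (x - w) - theta * s
        w = beta * w + (1 - beta) * (a * u + b * s)
        s = float(u >= theta)  # Heaviside spike; surrogate gradient in training
        spikes.append(s)
    return np.array(spikes)

spikes = adlif_forward(np.random.rand(100), log_tau_u=2.0,
                       log_tau_w=3.0, a=0.5, b=1.0)
```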
📝 Abstract
Multi-state spiking neurons such as the adaptive leaky integrate-and-fire (AdLIF) neuron offer compelling alternatives to conventional deep learning models thanks to their sparse binary activations, second-order nonlinear recurrent dynamics, and efficient hardware realizations. However, such internal dynamics can cause instabilities during inference and training, often limiting performance and scalability. Meanwhile, state space models (SSMs) excel at long-sequence processing using a linear state-intrinsic recurrence that resembles the subthreshold regime of spiking neurons. Here, we establish a mathematical bridge between SSMs and second-order spiking neuron models. Based on the structure and parametrization strategies of diagonal SSMs, we propose two novel spiking neuron models. The first extends the AdLIF neuron with timestep training and logarithmic reparametrization to facilitate training and improve final performance. The second additionally adopts the initialization and structure of complex-state SSMs, broadening the dynamical regime to oscillatory dynamics. Together, our two models achieve performance near or beyond the state of the art (SOTA) for reset-based spiking neuron models across both event-based and raw-audio speech recognition datasets. We achieve this with a favorable number of parameters and dynamic memory requirements while maintaining high activity sparsity. Our models demonstrate enhanced scalability in network size and strike a favorable balance between performance and efficiency relative to SSMs.
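For the second, oscillatory model, here is a minimal sketch assuming an S4D/S5-style complex-diagonal parameterization, where a single complex state performs a decaying rotation and the membrane potential is read out from its real part. The parameterization, the hard reset, and the names (`oscillatory_neuron`, `log_nu`, `omega`) are assumptions for illustration, not the paper's exact model; it only demonstrates how a complex-diagonal linear recurrence yields oscillatory subthreshold dynamics in a reset-based spiking neuron.

```python
import numpy as np

def oscillatory_neuron(I, log_nu, omega, theta=1.0, dt=1.0):
    """Complex-diagonal linear state with a reset-based spiking readout."""
    # lam = exp((-exp(log_nu) + i*omega) * dt): |lam| < 1 guarantees a
    # stable, decaying rotation for any real log_nu (illustrative choice).
    lam = np.exp((-np.exp(log_nu) + 1j * omega) * dt)
    z = 0.0 + 0.0j
    spikes = []
    for x in I:
        z = lam * z + x          # linear (subthreshold) recurrence
        u = 2.0 * z.real         # membrane potential from the complex state
        s = float(u >= theta)
        if s:
            z = 0.0 + 0.0j       # hard reset of the oscillatory state
        spikes.append(s)
    return np.array(spikes)
```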