AI Summary
This work addresses the challenge of training spiking neural networks, whose non-differentiable spike functions necessitate surrogate gradients that accumulate approximation errors and hinder convergence to global optima. For the first time, the authors extend a convex optimization framework to parallel recurrent threshold networks (a class that subsumes spiking networks as a special case) and introduce a parameter reconstruction algorithm that obviates the need for surrogate gradients, thereby enabling globally optimal training. The proposed method consistently outperforms existing approaches across diverse tasks, both when used on its own and when combined with surrogate-gradient methods. Ablations further indicate that it scales with data and is robust to model configuration.
Abstract
Spiking Neural Networks (SNNs) have been proposed as biologically plausible and energy-efficient alternatives to conventional Artificial Neural Networks (ANNs). However, the training of SNNs usually relies on surrogate gradients due to the non-differentiability of the spike function, introducing approximation errors that accumulate across layers. To address this challenge, we extend the work on convexification of parallel feedforward threshold networks to parallel recurrent threshold networks, which subsume parallel SNNs as a structured special case. Building on this theoretical framework, we propose a parameter reconstruction algorithm for SNN training that demonstrates consistent and significant advantages across various tasks, both as a standalone method and in combination with surrogate-gradient training. The ablations further demonstrate the data scalability of our training algorithm and its robustness to model configurations, pointing toward its potential in large-scale SNN training.
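To make the surrogate-gradient issue mentioned in the abstract concrete, the following minimal sketch contrasts the true derivative of a threshold spike function with a smooth stand-in. It assumes a sigmoid-derivative surrogate, which is one common choice and not necessarily the one used by the authors; the function names, the threshold, and the slope value are hypothetical and chosen only for illustration.

```python
import math


def spike(v, threshold=1.0):
    """Heaviside spike: 1.0 when the potential reaches the threshold, else 0.0."""
    return 1.0 if v >= threshold else 0.0


def surrogate_grad(v, threshold=1.0, slope=5.0):
    """Derivative of a sigmoid centred on the threshold (an assumed surrogate)."""
    s = 1.0 / (1.0 + math.exp(-slope * (v - threshold)))
    return slope * s * (1.0 - s)


def finite_difference(f, v, eps=1e-4):
    """Central finite-difference estimate of df/dv at v."""
    return (f(v + eps) - f(v - eps)) / (2 * eps)


# Hypothetical membrane potentials, none of them exactly at the threshold.
potentials = [0.2, 0.9, 1.5]

# The true derivative is zero away from the threshold, so it carries no learning signal.
true_grads = [finite_difference(spike, v) for v in potentials]

# The surrogate is nonzero everywhere and largest near the threshold.
approx_grads = [surrogate_grad(v) for v in potentials]

for v, t, a in zip(potentials, true_grads, approx_grads):
    print(f"v={v:.1f}  true grad={t:.3f}  surrogate grad={a:.3f}")
```

The gap between the two columns is the approximation the abstract refers to: each layer that substitutes the surrogate for the true derivative adds its own error to the backward pass.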