🤖 AI Summary
Deep spiking neural networks (SNNs) suffer from spike attenuation and information loss due to membrane potential binarization during spike generation. Method: This paper proposes the first theory-driven weight initialization method explicitly designed for SNN spike dynamics—distinct from heuristic adaptations of artificial neural network (ANN) initialization schemes. It rigorously models membrane potential evolution and spike generation, deriving optimal initialization criteria that guarantee stable multi-layer spike propagation. Results: The method enables lossless inter-temporal spike transmission in 100-layer SNNs; improves accuracy and accelerates convergence on MNIST; and exhibits strong robustness to critical hyperparameters—including neuronal threshold and firing rate. By bridging the theoretical gap between SNN dynamics and initialization, this work overcomes the fundamental limitation of conventional initialization strategies in deep SNNs, providing both theoretical foundations and a practical framework for training scalable, deep SNNs.
📝 Abstract
Spiking Neural Networks (SNNs) and neuromorphic computing offer bio-inspired advantages such as sparsity and ultra-low power consumption, providing a promising alternative to conventional networks. However, training deep SNNs from scratch remains a challenge, as SNNs process and transmit information by quantizing the real-valued membrane potentials into binary spikes. This can lead to information loss and vanishing spikes in deeper layers, impeding effective training. While weight initialization is known to be critical for training deep neural networks, what constitutes an effective initial state for a deep SNN is not well-understood. Existing weight initialization methods designed for conventional networks (ANNs) are often applied to SNNs without accounting for their distinct computational properties. In this work we derive an optimal weight initialization method specifically tailored for SNNs, taking into account the quantization operation. We show theoretically that, unlike standard approaches, this method enables the propagation of activity in deep SNNs without loss of spikes. We demonstrate this behavior in numerical simulations of SNNs with up to 100 layers across multiple time steps. We present an in-depth analysis of the numerical conditions, regarding layer width and neuron hyperparameters, which are necessary to accurately apply our theoretical findings. Furthermore, our experiments on MNIST demonstrate higher accuracy and faster convergence when using the proposed weight initialization scheme. Finally, we show that the newly introduced weight initialization is robust against variations in several network and neuron hyperparameters.