🤖 AI Summary
Spiking Neural Networks (SNNs) suffer from severe overfitting on small-scale neuromorphic datasets, driven by gradient mismatch and data scarcity.
Method: We propose a time-regularized training framework that exploits the concentration of temporal information in SNNs. Specifically, we design a time-varying Fisher information-guided decay regularizer, imposing stronger constraints during early critical timesteps to encourage learning of robust spatiotemporal features. The method integrates loss landscape visualization, learning curve analysis, and dynamic Fisher information tracking.
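The summary above describes stronger regularization at early timesteps that decays over time, but does not give the exact schedule or penalty. As one possible reading, the sketch below assumes an exponential decay and a generic per-timestep penalty; the names `temporal_reg_weights` and `regularized_loss` are illustrative, not from the paper.

```python
import math

def temporal_reg_weights(num_steps, lam0=1.0, alpha=0.5):
    """Per-timestep regularization strengths lambda_t = lam0 * exp(-alpha * t).

    The weight is largest at t = 0 and decays monotonically, so early
    timesteps receive the strongest constraint (assumed schedule).
    """
    return [lam0 * math.exp(-alpha * t) for t in range(num_steps)]

def regularized_loss(task_loss, per_step_penalties, weights):
    """Total loss = task loss + sum_t lambda_t * penalty_t.

    `per_step_penalties` stands in for whatever per-timestep quantity
    is penalized (e.g. an output or activation discrepancy at step t).
    """
    return task_loss + sum(w * p for w, p in zip(weights, per_step_penalties))

weights = temporal_reg_weights(4)
# weights are strictly decreasing, so the earliest timestep is constrained most
```

Any monotonically decreasing schedule would realize the same idea; the exponential form is only one convenient choice.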
Results: Extensive experiments on multiple benchmarks, including CIFAR-10/100, ImageNet-100, DVS-CIFAR10, and N-Caltech101, together with ablation studies, demonstrate that the approach significantly mitigates overfitting, flattens the loss landscape, and improves generalization. It establishes an interpretable, time-aware regularization paradigm for brain-inspired learning on limited data.
📝 Abstract
Spiking Neural Networks (SNNs) have received widespread attention due to their event-driven and low-power characteristics, making them particularly effective for processing event-based neuromorphic data. Recent studies have shown that directly trained SNNs suffer from severe overfitting due to the limited scale of neuromorphic datasets and the gradient mismatch problem, which fundamentally constrains their generalization performance. In this paper, we propose a temporal regularization training (TRT) method that introduces a time-dependent regularization mechanism to enforce stronger constraints on early timesteps. We compare TRT with other state-of-the-art methods on CIFAR10/100, ImageNet100, DVS-CIFAR10, and N-Caltech101. To validate the effectiveness of TRT, we conduct ablation studies and analyses, including loss landscape visualization and learning curve analysis, demonstrating that TRT effectively mitigates overfitting and flattens the training loss landscape, thereby enhancing generalizability. Furthermore, we establish a theoretical interpretation of TRT's temporal regularization mechanism based on Fisher information analysis. By tracking Fisher information during TRT training, we analyze the temporal information dynamics inside SNNs and reveal the Temporal Information Concentration (TIC) phenomenon, in which Fisher information progressively concentrates in early timesteps. The time-decaying regularization mechanism in TRT guides the network to learn robust features in the information-rich early timesteps, leading to significant improvements in generalization. Code is available at https://github.com/ZBX05/Temporal-Regularization-Training.
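The abstract's TIC analysis rests on a per-timestep Fisher information estimate. The paper's exact estimator is not given here; a common empirical proxy is the mean squared norm of per-sample log-likelihood gradients at each timestep, sketched below on synthetic gradients whose magnitude shrinks over time to mimic the reported concentration in early timesteps (the decay scale and shapes are assumptions for illustration).

```python
import numpy as np

def empirical_fisher_per_timestep(grads):
    """Empirical Fisher proxy per timestep.

    grads: array of shape (T, N, D) holding per-sample gradients of the
    log-likelihood w.r.t. D parameters, for N samples at each of T timesteps.
    Returns a length-T vector: mean over samples of the squared gradient norm.
    """
    return (grads ** 2).sum(axis=2).mean(axis=1)

# Synthetic gradients with exponentially shrinking magnitude over timesteps,
# standing in for gradients measured during training (illustrative only).
rng = np.random.default_rng(0)
T, N, D = 5, 32, 10
scale = np.exp(-0.8 * np.arange(T))[:, None, None]
grads = scale * rng.standard_normal((T, N, D))

fisher = empirical_fisher_per_timestep(grads)
# fisher[0] dominates: the information mass sits in the early timesteps,
# which is the pattern the TIC phenomenon describes
```

Under this proxy, a decaying Fisher profile over timesteps is exactly the signal that motivates putting the strongest regularization weight on early timesteps.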