🤖 AI Summary
Spiking Neural Networks (SNNs) face significant challenges in scaling self-supervised learning to large unlabeled datasets, primarily because the discrete, non-differentiable nature of spikes breaks the cross-view gradient consistency that contrastive and consistency-based objectives rely on.
Method: We propose a dual-path neuron architecture that jointly integrates a differentiable surrogate branch—enabling gradient propagation during training—and a genuine spiking branch—preserving full spike dynamics during inference. Coupled with cross-view and temporal alignment losses, this design enhances inter-sample representation consistency within both convolutional and Transformer-based SNNs.
Contribution/Results: This work achieves the first full self-supervised pretraining of SNNs at ImageNet scale. Our Spikformer-16-512 model attains 70.1% top-1 accuracy on ImageNet-1K, demonstrating the feasibility of high-capacity SNNs for unsupervised learning at modern scales.
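The dual-path neuron described above can be illustrated with a minimal sketch: a genuine spiking branch applies a Heaviside threshold (used at inference), while a surrogate branch supplies a smooth pseudo-derivative for the backward pass. The leaky integrate-and-fire dynamics, sigmoid surrogate, and constants (`tau`, `v_th`, `beta`) are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def dual_path_lif(x_seq, tau=2.0, v_th=1.0, beta=4.0):
    """Sketch of a dual-path leaky integrate-and-fire neuron.

    x_seq: (T, N) input currents over T timesteps. Returns the binary
    spike trains (spiking branch) and the sigmoid-based surrogate
    derivatives that the differentiable branch would propagate in
    place of the Heaviside derivative during training.
    """
    v = np.zeros_like(x_seq[0])
    spikes, surrogate_grads = [], []
    for x_t in x_seq:
        v = v + (x_t - v) / tau             # leaky membrane integration
        s = (v >= v_th).astype(x_t.dtype)   # spiking branch: Heaviside step
        sig = 1.0 / (1.0 + np.exp(-beta * (v - v_th)))
        surrogate_grads.append(beta * sig * (1.0 - sig))  # smooth pseudo-grad
        spikes.append(s)
        v = v * (1.0 - s)                   # hard reset after a spike
    return np.stack(spikes), np.stack(surrogate_grads)
```

Because the surrogate derivative is computed from the same membrane potential that drives the spikes, the training branch can be dropped at inference and only the binary spike path remains.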
📝 Abstract
Spiking neural networks (SNNs) exhibit temporal, sparse, and event-driven dynamics that make them appealing for efficient inference. However, extending these models to self-supervised regimes remains challenging because the discontinuities introduced by spikes break the cross-view gradient correspondences required by contrastive and consistency-driven objectives. This work introduces a training paradigm that enables large SNN architectures to be optimized without labeled data. We formulate a dual-path neuron in which a spike-generating process is paired with a differentiable surrogate branch, allowing gradients to propagate across augmented inputs while preserving a fully spiking implementation at inference. In addition, we propose temporal alignment objectives that enforce representational coherence both across spike timesteps and between augmented views. Using convolutional and transformer-style SNN backbones, we demonstrate ImageNet-scale self-supervised pretraining and strong transfer to classification, detection, and segmentation benchmarks. Our best model, a fully self-supervised Spikformer-16-512, achieves 70.1% top-1 accuracy on ImageNet-1K, demonstrating that unlabeled learning in high-capacity SNNs is feasible at modern scale.
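The two alignment objectives in the abstract can be sketched as cosine-similarity losses over per-timestep features: a cross-view term that pulls time-averaged embeddings of two augmentations together, and a temporal term that pulls consecutive timesteps of each view together. The exact weighting and normalization here are assumptions for illustration, not the paper's loss.

```python
import numpy as np

def alignment_losses(z_a, z_b, eps=1e-8):
    """Sketch of cross-view and temporal alignment objectives.

    z_a, z_b: (T, D) feature sequences for the same image under two
    augmentations, one vector per spike timestep. Returns
    (cross_view_loss, temporal_loss), both zero when features are
    perfectly aligned.
    """
    def cos(u, v):
        return np.sum(u * v, axis=-1) / (
            np.linalg.norm(u, axis=-1) * np.linalg.norm(v, axis=-1) + eps)

    # Cross-view: align the time-averaged representations of both views.
    cross_view = 1.0 - cos(z_a.mean(axis=0), z_b.mean(axis=0))
    # Temporal: align each timestep with its predecessor, per view.
    temporal = np.mean(
        [1.0 - cos(z[1:], z[:-1]).mean() for z in (z_a, z_b)])
    return cross_view, temporal
```

Both terms are differentiable through the surrogate branch during training, which is what lets these consistency signals reach the spiking backbone's weights.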