🤖 AI Summary
Binary spiking neural networks (SNNs) still incur substantial memory and computational overhead on edge devices. To address this, the authors propose Sub-bit SNNs (S²NNs), a framework that represents weights with less than one bit each. The method rests on two key components: (1) outlier-aware sub-bit weight quantization (OS-Quant), which corrects the outlier-induced codeword selection bias of a naive sub-bit baseline by identifying and adaptively scaling outliers; and (2) membrane potential-based feature distillation (MPFD), which improves the highly compressed S²NN through more precise guidance from a teacher model. Experiments on vision and non-vision benchmarks show that S²NNs outperform existing quantized SNNs in both accuracy and efficiency, making them promising for resource-constrained edge deployment.
📝 Abstract
Spiking Neural Networks (SNNs) offer an energy-efficient paradigm for machine intelligence, but their continued scaling poses challenges for resource-limited deployment. Despite recent advances in binary SNNs, the storage and computational demands remain substantial for large-scale networks. To further explore the compression and acceleration potential of SNNs, we propose Sub-bit Spiking Neural Networks (S²NNs) that represent weights with less than one bit. Specifically, we first establish an S²NN baseline by leveraging the clustering patterns of kernels in well-trained binary SNNs. This baseline is highly efficient but suffers from *outlier-induced codeword selection bias* during training. To mitigate this issue, we propose an *outlier-aware sub-bit weight quantization* (OS-Quant) method, which optimizes codeword selection by identifying and adaptively scaling outliers. Furthermore, we propose a *membrane potential-based feature distillation* (MPFD) method, improving the performance of highly compressed S²NN via more precise guidance from a teacher model. Extensive results on vision and non-vision tasks reveal that S²NN outperforms existing quantized SNNs in both performance and efficiency, making it promising for edge computing applications.
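The abstract does not spell out OS-Quant's details, but the core idea of sub-bit storage via kernel clustering can be illustrated. The sketch below is my own minimal assumption-laden example, not the authors' method: it clusters binary 3×3 kernels into a 16-entry binary codebook, so each 9-weight kernel is stored as a single 4-bit index, i.e. roughly 0.44 bits per weight.

```python
import numpy as np

def build_codebook(binary_kernels, K=16, iters=10, seed=0):
    """Cluster flattened {-1,+1} kernels into K binary codewords.
    A k-means-style loop; centroids are re-binarized by sign after
    each update so codewords stay binary. Illustrative only."""
    rng = np.random.default_rng(seed)
    n, d = binary_kernels.shape
    codebook = binary_kernels[rng.choice(n, K, replace=False)].astype(float)
    for _ in range(iters):
        # Squared L2 distance on {-1,+1} vectors is monotone in
        # Hamming distance, so nearest codeword = fewest sign flips.
        dists = ((binary_kernels[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(1)
        for k in range(K):
            members = binary_kernels[assign == k]
            if len(members):
                codebook[k] = np.sign(members.mean(0) + 1e-9)  # re-binarize
    return codebook.astype(np.int8), assign

# Toy example: 1000 random 3x3 binary kernels compressed to K = 16 codewords.
kernels = np.sign(np.random.default_rng(1).standard_normal((1000, 9)))
codebook, assign = build_codebook(kernels, K=16)
bits_per_weight = np.log2(16) / 9  # 4-bit index per 9-weight kernel
print(f"{bits_per_weight:.2f} bits/weight")
```

Each stored network then keeps only the 16 codewords plus one index per kernel; the outlier handling and membrane-potential distillation described above are what recover the accuracy this lossy sharing gives up.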