AI Summary
Spike-based surrogate gradients in spiking neural networks (SNNs) suffer severe distortion under strong adversarial attacks because binary spike activations are non-differentiable. To address this, we propose Adaptive Sharpness-aware Surrogate Gradients (ASSG) and Stable Adaptive Projected Gradient Descent (SA-PGD) attacks: the former enables dynamic evolution of surrogate gradient shapes conditioned on input distributions, while the latter ensures robust optimization under $L_\infty$ constraints despite gradient inaccuracy. Our framework integrates theoretical analysis, adaptive surrogate function design, and an adaptive-step-size PGD mechanism. Experiments demonstrate substantially improved attack success rates across diverse SNN architectures, neuron models, and adversarial training paradigms, and the empirical results reveal that current SNN robustness is systematically overestimated. This work establishes the first benchmark framework for trustworthy SNN security evaluation, combining theoretical rigor with practical efficacy.
Abstract
Spiking Neural Networks (SNNs) utilize spike-based activations to mimic the brain's energy-efficient information processing. However, the binary and discontinuous nature of spike activations causes vanishing gradients, making adversarial robustness evaluation via gradient descent unreliable. While improved surrogate gradient methods have been proposed, their effectiveness under strong adversarial attacks remains unclear. We propose a more reliable framework for evaluating SNN adversarial robustness. We theoretically analyze the degree of gradient vanishing in surrogate gradients and introduce the Adaptive Sharpness Surrogate Gradient (ASSG), which adaptively evolves the shape of the surrogate function according to the input distribution during attack iterations, thereby enhancing gradient accuracy while mitigating gradient vanishing. In addition, we design an adversarial attack with adaptive step size under the $L_\infty$ constraint, Stable Adaptive Projected Gradient Descent (SA-PGD), achieving faster and more stable convergence under imprecise gradients. Extensive experiments show that our approach substantially increases attack success rates across diverse adversarial training schemes, SNN architectures, and neuron models, providing a more generalized and reliable evaluation of SNN adversarial robustness. The experimental results further reveal that the robustness of current SNNs has been significantly overestimated, highlighting the need for more dependable adversarial training methods.
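The two ingredients the abstract combines can be sketched in a few lines: a smooth surrogate derivative standing in for the non-differentiable spike function, and a projected gradient ascent loop that stays inside the $L_\infty$ ball while shrinking its step size. This is a minimal illustration only; the function names, the sigmoid-based surrogate, and the geometric step-size decay are assumptions, not the paper's actual ASSG shape-adaptation or SA-PGD step-size rules.

```python
import numpy as np

def surrogate_grad(v, alpha=4.0):
    """Sigmoid-derivative surrogate for d(spike)/dv at membrane potential v.

    The true spike function is a Heaviside step, whose derivative is zero
    almost everywhere; alpha controls the sharpness of the smooth stand-in.
    """
    s = 1.0 / (1.0 + np.exp(-alpha * v))
    return alpha * s * (1.0 - s)

def pgd_linf(x, grad_fn, eps=0.1, steps=10, step0=0.04, decay=0.8):
    """Projected gradient ascent inside the L_inf ball of radius eps.

    A geometric step-size decay is used here as a simple stand-in for an
    adaptive step-size schedule; grad_fn returns the (surrogate) loss
    gradient w.r.t. the input.
    """
    x_adv = x.copy()
    step = step0
    for _ in range(steps):
        g = grad_fn(x_adv)
        x_adv = x_adv + step * np.sign(g)          # ascent step on the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)   # project back to the ball
        step *= decay                               # shrink step for stability
    return x_adv

# Toy usage: one "neuron" with weights w and threshold 0.5; the input
# gradient is chained through the surrogate spike derivative.
x = np.zeros(5)
w = np.ones(5)
grad_fn = lambda z: w * surrogate_grad(z @ w - 0.5)
x_adv = pgd_linf(x, grad_fn, eps=0.1)
```

The projection step is what makes the attack respect the $L_\infty$ budget regardless of how inaccurate the surrogate gradient is: every iterate satisfies `max|x_adv - x| <= eps` by construction.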