Attacking the Spike: On the Transferability and Security of Spiking Neural Networks to Adversarial Examples

📅 2022-09-07
📈 Citations: 13
Influential: 0
🤖 AI Summary
This work systematically investigates the underexplored problem of adversarial robustness in Spiking Neural Networks (SNNs). The authors find that white-box attacks against SNNs depend heavily on the choice of surrogate gradient technique, and that adversarial examples transfer poorly across architectures (e.g., between SNNs and ViTs/CNNs). Building on this coupling between SNN vulnerability and surrogate gradient estimation, they propose Auto-SAGA, a white-box attack that adaptively blends self-attention gradient estimates and surrogate gradients to craft adversarial examples that fool SNN and non-SNN models simultaneously. Evaluated on CIFAR-10, CIFAR-100, and ImageNet, Auto-SAGA is up to 91.1% more effective on SNN/ViT ensembles and roughly three times as effective as Auto-PGD on adversarially trained SNN ensembles, significantly outperforming existing baselines.
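The core idea of an adaptive ensemble attack like Auto-SAGA can be sketched in a few lines: blend per-model input gradients with coefficients that shift toward whichever model the current perturbation fools least. The sketch below is a toy illustration under stated assumptions, not the paper's formulation: the logistic-regression "models", the softmax weight update, and the step sizes are all hypothetical simplifications standing in for surrogate-gradient SNNs and self-attention gradient estimates.

```python
import numpy as np

def model_loss_and_grad(w, x, y):
    """Toy logistic-regression 'model': returns loss and d(loss)/dx."""
    z = float(w @ x)
    p = 1.0 / (1.0 + np.exp(-z))  # P(label = 1)
    loss = -(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    grad_x = (p - y) * w          # analytic gradient w.r.t. the input
    return loss, grad_x

def ensemble_attack(models, x, y, eps=0.5, steps=20, alpha=0.1):
    """Blend per-model input gradients with adaptive coefficients
    (illustrative weighting rule, not the paper's exact scheme)."""
    x_adv = x.copy()
    for _ in range(steps):
        losses, grads = [], []
        for w in models:
            loss, g = model_loss_and_grad(w, x_adv, y)
            losses.append(loss)
            grads.append(g)
        # Upweight models with low loss: the attack is failing on them,
        # so their gradients should dominate the next step.
        raw = np.exp(-np.array(losses))
        coeffs = raw / raw.sum()
        blended = sum(c * g for c, g in zip(coeffs, grads))
        x_adv = x_adv + alpha * np.sign(blended)   # ascend the blended loss
        x_adv = np.clip(x_adv, x - eps, x + eps)   # stay inside the eps-ball
    return x_adv

# Two toy "models" with different decision directions.
models = [np.array([2.0, -1.0]), np.array([1.5, 0.5])]
x = np.array([1.0, 1.0])
y = 1
x_adv = ensemble_attack(models, x, y)
```

The adaptive reweighting is what distinguishes this from a naive ensemble attack with fixed uniform weights: a single step direction that helps on one model but stalls on another gets corrected on the next iteration.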
📝 Abstract
Spiking neural networks (SNNs) have attracted much attention for their high energy efficiency and for recent advances in their classification performance. However, unlike traditional deep learning approaches, the analysis and study of the robustness of SNNs to adversarial examples remain relatively underdeveloped. In this work, we focus on advancing the adversarial attack side of SNNs and make three major contributions. First, we show that successful white-box adversarial attacks on SNNs are highly dependent on the underlying surrogate gradient technique, even in the case of adversarially trained SNNs. Second, using the best surrogate gradient technique, we analyze the transferability of adversarial attacks on SNNs and other state-of-the-art architectures like Vision Transformers (ViTs) and Big Transfer Convolutional Neural Networks (CNNs). We demonstrate that the adversarial examples created by non-SNN architectures are not misclassified often by SNNs. Third, due to the lack of a ubiquitous white-box attack that is effective across both the SNN and CNN/ViT domains, we develop a new white-box attack, the Auto Self-Attention Gradient Attack (Auto-SAGA). Our novel attack generates adversarial examples capable of fooling both SNN and non-SNN models simultaneously. Auto-SAGA is as much as 91.1% more effective on SNN/ViT model ensembles and provides a 3× boost in attack effectiveness on adversarially trained SNN ensembles compared to conventional white-box attacks like Auto-PGD. Our experiments and analyses are broad and rigorous covering three datasets (CIFAR-10, CIFAR-100 and ImageNet), five different white-box attacks and nineteen classifier models (seven for each CIFAR dataset and five models for ImageNet).
Problem

Research questions and friction points this paper is trying to address.

Investigating SNN adversarial robustness using surrogate gradient estimation techniques
Analyzing transferability gaps between SNNs and non-SNN architectures like ViTs
Proposing Auto-SAGA, an attack that dynamically combines multiple surrogate gradients
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic combination of multiple surrogate gradients
Adversarial examples that fool SNN and non-SNN models simultaneously
Higher attack success rates than conventional white-box attacks across SNN, ViT, and CNN ensembles
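The surrogate-gradient dependence that runs through these bullets comes from the spiking nonlinearity itself: a neuron fires via a Heaviside step whose true derivative is zero almost everywhere, so any gradient-based attack must substitute a smooth approximation. The toy sketch below shows two common surrogate shapes; the specific functions and the width parameter `alpha` are illustrative assumptions, not the choices evaluated in the paper.

```python
import numpy as np

def spike(v, threshold=1.0):
    """Non-differentiable firing function: 1 when the membrane
    potential reaches the threshold, else 0."""
    return (v >= threshold).astype(float)

def surrogate_grad_sigmoid(v, threshold=1.0, alpha=4.0):
    """Sigmoid-derivative surrogate for d(spike)/dv: smooth,
    peaked at the firing threshold."""
    s = 1.0 / (1.0 + np.exp(-alpha * (v - threshold)))
    return alpha * s * (1.0 - s)

def surrogate_grad_triangle(v, threshold=1.0, alpha=1.0):
    """Piecewise-linear (triangle) surrogate: peaked at the
    threshold, exactly zero outside a window of width 2*alpha."""
    return np.maximum(0.0, 1.0 - np.abs(v - threshold) / alpha) / alpha

v = np.linspace(0.0, 2.0, 5)      # membrane potentials around threshold
hard = spike(v)                    # hard 0/1 firing decisions
g_sig = surrogate_grad_sigmoid(v)  # nonzero gradient signal near threshold
g_tri = surrogate_grad_triangle(v)
```

Because different surrogates assign very different gradient mass away from the threshold, an attack tuned to one surrogate can fail under another, which is the coupling the paper exploits by combining several surrogates dynamically.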