SpaNN: Detecting Multiple Adversarial Patches on CNNs by Spanning Saliency Thresholds

📅 2025-06-23

🤖 AI Summary

CNN models for object detection and image classification are vulnerable to physically realizable adversarial patch attacks. Mainstream defenses focus on a single patch; against multiple patches, their sensitivity to the patch count is either untested or their computational cost becomes prohibitive. This paper proposes SpaNN, an attack detector whose computational complexity is independent of the expected number of patches. It applies a set of saliency thresholds to the activations of the victim model's first convolutional layer to build an ensemble of binarized feature maps, clusters the ensemble, and feeds the resulting cluster features to a classifier that decides whether an attack is present. Because it does not rely on a single fixed saliency threshold, the detector is robust against white-box attacks. Evaluated on four widely used datasets, it outperforms state-of-the-art defenses by up to 11 percentage points in object detection and up to 27 percentage points in image classification.

📝 Abstract

State-of-the-art convolutional neural network models for object detection and image classification are vulnerable to physically realizable adversarial perturbations, such as patch attacks. Existing defenses have focused, implicitly or explicitly, on single-patch attacks, leaving their sensitivity to the number of patches as an open question or rendering them computationally infeasible or inefficient against attacks consisting of multiple patches in the worst cases. In this work, we propose SpaNN, an attack detector whose computational complexity is independent of the expected number of adversarial patches. The key novelty of the proposed detector is that it builds an ensemble of binarized feature maps by applying a set of saliency thresholds to the neural activations of the first convolutional layer of the victim model. It then performs clustering on the ensemble and uses the cluster features as the input to a classifier for attack detection. Contrary to existing detectors, SpaNN does not rely on a fixed saliency threshold for identifying adversarial regions, which makes it robust against white box adversarial attacks. We evaluate SpaNN on four widely used data sets for object detection and classification, and our results show that SpaNN outperforms state-of-the-art defenses by up to 11 and 27 percentage points in the case of object detection and the case of image classification, respectively. Our code is available at https://github.com/gerkbyrd/SpaNN.
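The pipeline described in the abstract (threshold set, binarized ensemble, clustering, cluster features) can be illustrated with a toy sketch. This is not the authors' implementation: the function names are hypothetical, the thresholds are assumed to be evenly spaced fractions of the peak activation, and 4-connected component labeling stands in for the clustering step, which the abstract does not specify. The final attack classifier is omitted; in a real detector the feature list would be its input.

```python
from collections import deque


def binarize(fmap, threshold):
    """Return a 0/1 map marking activations at or above `threshold`."""
    return [[1 if v >= threshold else 0 for v in row] for row in fmap]


def cluster_sizes(binary):
    """Sizes of 4-connected groups of 1s (assumed stand-in for clustering)."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    sizes = []
    for r in range(h):
        for c in range(w):
            if not binary[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited salient cell.
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w \
                            and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)
    return sizes


def cluster_features(fmap, num_thresholds=5):
    """One (cluster count, mean cluster size) pair per threshold.

    Thresholds are evenly spaced fractions of the peak activation; this
    spacing is an assumption made for illustration only.
    """
    peak = max(max(row) for row in fmap)
    features = []
    for k in range(1, num_thresholds + 1):
        threshold = peak * k / (num_thresholds + 1)
        sizes = cluster_sizes(binarize(fmap, threshold))
        mean_size = sum(sizes) / len(sizes) if sizes else 0.0
        features.append((len(sizes), mean_size))
    return features


# Toy activation map with two high-activation 2x2 regions.
toy_map = [
    [0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.9, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.9, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.8, 0.8],
    [0.1, 0.1, 0.1, 0.1, 0.8, 0.8],
    [0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
]
print(cluster_features(toy_map, num_thresholds=3))
```

Note that the loop runs once per threshold regardless of how many salient regions the map contains, which mirrors the abstract's claim that the cost does not depend on the expected number of patches. A single threshold of 0.85 would keep only the first region in the toy map, which hints at why a fixed threshold is easier for an attacker to work around than a whole set.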

Problem

Research questions and friction points this paper is trying to address.

Detects multiple adversarial patches on CNNs

Addresses computational inefficiency in multi-patch defenses

Improves robustness against white-box adversarial attacks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Ensemble of binarized feature maps

Clustering on the ensemble, with cluster features fed to an attack classifier

A set of saliency thresholds in place of a single fixed threshold
