Concept-Based Masking: A Patch-Agnostic Defense Against Adversarial Patch Attacks

📅 2025-10-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing defenses against adversarial patch attacks rely heavily on prior knowledge of the patch's location or size, limiting their generalizability to unseen physical patches. Method: This paper proposes a patch-agnostic robust defense framework that leverages Concept Activation Vectors (CAVs) for feature attribution, identifying and suppressing the semantic concepts in the model's decision that are most sensitive to perturbation, without explicitly detecting the patch's location or dimensions. Contribution/Results: It is the first work to jointly model concept interpretability and robustness, enabling a generalized defense against physical patches of arbitrary size and position. Evaluated on the Imagenette dataset using ResNet-50, the method achieves higher robust accuracy and clean accuracy than PatchCleanser, demonstrating superior stability and practicality across diverse scenarios.

📝 Abstract
Adversarial patch attacks pose a practical threat to deep learning models by forcing targeted misclassifications through localized perturbations, often realized in the physical world. Existing defenses typically assume prior knowledge of patch size or location, limiting their applicability. In this work, we propose a patch-agnostic defense that leverages concept-based explanations to identify and suppress the most influential concept activation vectors, thereby neutralizing patch effects without explicit detection. Evaluated on Imagenette with a ResNet-50, our method achieves higher robust and clean accuracy than the state-of-the-art PatchCleanser, while maintaining strong performance across varying patch sizes and locations. Our results highlight the promise of combining interpretability with robustness and suggest concept-driven defenses as a scalable strategy for securing machine learning models against adversarial patch attacks.
Problem

Research questions and friction points this paper is trying to address.

Defending against adversarial patch attacks without prior patch knowledge
Suppressing influential concept activations to neutralize patch effects
Achieving robust accuracy across varying patch sizes and locations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages concept-based explanations for defense
Suppresses influential concept activation vectors
Neutralizes patch effects without explicit detection
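The suppression idea described above can be sketched in a few lines: project the model's internal activations onto a set of concept activation vectors, find the concepts with the strongest (most perturbation-sensitive) responses, and subtract those components before classification. This is a minimal illustration under assumed simplifications, not the paper's actual implementation; the function name `suppress_top_concepts`, the choice of top-k selection by absolute score, and the use of plain vector projection are all hypothetical stand-ins for the authors' CAV-based attribution pipeline.

```python
import numpy as np

def suppress_top_concepts(h: np.ndarray, cavs: np.ndarray, k: int = 2) -> np.ndarray:
    """Suppress the k most influential concept components of an activation.

    h    : (dim,) activation vector from an intermediate layer.
    cavs : (num_concepts, dim) unit-norm Concept Activation Vectors.
    k    : number of concept directions to suppress (hypothetical choice).
    """
    scores = cavs @ h                       # activation score per concept
    top = np.argsort(-np.abs(scores))[:k]   # indices of strongest concepts
    h_def = h.copy()
    for i in top:
        # remove the component of h lying along concept direction i
        h_def = h_def - scores[i] * cavs[i]
    return h_def

# Toy usage: with orthonormal concept directions, suppressing the top
# concept zeroes out exactly that component of the activation.
h = np.array([5.0, 1.0, 0.5, 2.0])
cavs = np.eye(4)[:3]          # three axis-aligned "concepts"
defended = suppress_top_concepts(h, cavs, k=1)
```

In a real pipeline the CAVs would be learned from labeled concept examples (as in the original TCAV framework) and the defended activation would be fed back into the remaining layers of the network.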