Deep learning models are vulnerable, but adversarial examples are even more vulnerable

📅 2025-11-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the fundamental distinction between adversarial and clean samples under local occlusion, proposing an efficient, adversarial-training-free detection method. The core innovation is Sliding Mask Confidence Entropy (SMCE), a metric that quantifies the volatility of model prediction confidence under sliding-window occlusion; adversarial samples exhibit significantly higher SMCE values than clean ones. Based on this observation, the authors design SWM-AED, a lightweight detection framework, and introduce Mask Entropy Field Maps for interpretable visualization and statistical modeling. Extensive evaluation on CIFAR-10 across diverse attack types (e.g., FGSM, PGD, CW) and architectures (ResNet, VGG, DenseNet) demonstrates consistent detection accuracy exceeding 62%, with a peak of 96.5%. The method exhibits strong robustness, cross-attack and cross-architecture generalizability, and deployment efficiency, requiring no retraining or architectural modification.

📝 Abstract
Understanding intrinsic differences between adversarial examples and clean samples is key to enhancing DNN robustness and detection against adversarial attacks. This study first empirically finds that image-based adversarial examples are notably sensitive to occlusion. Controlled experiments on CIFAR-10 used nine canonical attacks (e.g., FGSM, PGD) to generate adversarial examples, paired with original samples for evaluation. We introduce Sliding Mask Confidence Entropy (SMCE) to quantify model confidence fluctuation under occlusion. Using 1800+ test images, SMCE calculations supported by Mask Entropy Field Maps and statistical distributions show adversarial examples have significantly higher confidence volatility under occlusion than originals. Based on this, we propose Sliding Window Mask-based Adversarial Example Detection (SWM-AED), which avoids catastrophic overfitting of conventional adversarial training. Evaluations across classifiers and attacks on CIFAR-10 demonstrate robust performance, with accuracy over 62% in most cases and up to 96.5%.
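The abstract does not give the exact SMCE formula, but the idea it describes can be sketched: slide a square occlusion mask over the image, query the model at each position, and score how volatile the predictions are. The sketch below assumes SMCE averages the Shannon entropy of the predicted class distribution across mask positions (so that prediction flips under occlusion, typical of adversarial examples, raise the score), and that masks are zero-filled; the function and parameter names are illustrative, not the paper's.

```python
import numpy as np

def smce(image, predict_fn, mask_size=8, stride=4, eps=1e-12):
    """Sliding Mask Confidence Entropy (sketch, not the paper's exact formula).

    Slides a square zero-valued mask over the image, queries the classifier
    at each position, and averages the Shannon entropy of the predicted
    class distribution. Volatile predictions under occlusion yield a
    higher score. Also returns the per-position entropies as a 2-D grid,
    analogous to the paper's Mask Entropy Field Map.
    """
    h, w = image.shape[:2]
    entropies = []
    for y in range(0, h - mask_size + 1, stride):
        for x in range(0, w - mask_size + 1, stride):
            occluded = image.copy()
            occluded[y:y + mask_size, x:x + mask_size] = 0.0  # zero-fill mask
            p = np.asarray(predict_fn(occluded), dtype=float)
            entropies.append(float(-np.sum(p * np.log(p + eps))))
    n = (h - mask_size) // stride + 1  # mask positions per axis
    field_map = np.array(entropies).reshape(n, -1)
    return float(field_map.mean()), field_map
```

A model whose class probabilities stay peaked under every occlusion produces a low score; one that collapses toward a flat distribution when key pixels are masked produces a high one, matching the claimed gap between clean and adversarial inputs.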
Problem

Research questions and friction points this paper is trying to address.

Detecting adversarial examples by analyzing their sensitivity to occlusion patterns
Quantifying model confidence fluctuations under occlusion using SMCE metric
Developing SWM-AED detection method to avoid catastrophic overfitting issues
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sliding Mask Confidence Entropy quantifies confidence fluctuation
Sliding Window Mask-based detection avoids catastrophic overfitting
Method leverages adversarial examples' higher sensitivity to occlusion
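Given an SMCE score, the detection step reduces to thresholding: inputs whose confidence volatility under occlusion exceeds the threshold are flagged as adversarial. The calibration rule below (a high quantile of SMCE scores measured on known-clean validation images) is an assumption for illustration; the paper's actual decision procedure may differ.

```python
import numpy as np

def calibrate_threshold(clean_smce_scores, quantile=0.95):
    """Pick a decision threshold from SMCE scores of known-clean images.

    Assumed calibration rule: accept roughly `quantile` of clean inputs,
    flagging anything more volatile than that as adversarial.
    """
    return float(np.quantile(clean_smce_scores, quantile))

def is_adversarial(smce_score, threshold):
    """Flag an input as adversarial when its SMCE exceeds the threshold."""
    return smce_score > threshold
```

Because the detector only reads model outputs under masked queries, it needs no retraining or architectural change, consistent with the deployment-efficiency claim above.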
Jun Li
School of Management Science and Information Engineering, Jilin University of Finance and Economics, Jingyue Street, Changchun 130117, China
Yanwei Xu
School of Management Science and Information Engineering, Jilin University of Finance and Economics, Jingyue Street, Changchun 130117, China
Keran Li
School of Management Science and Information Engineering, Jilin University of Finance and Economics, Jingyue Street, Changchun 130117, China
Xiaoli Zhang
Jilin University
image fusion, data mining, image segmentation, deep learning