🤖 AI Summary
This study addresses the vulnerability of computer vision models to adversarial evasion attacks by proposing a novel white-box attack method grounded in SHAP (Shapley Additive Explanations) values. The approach leverages SHAP during inference to quantify the contribution of individual input features to the model’s output, enabling the generation of highly imperceptible adversarial examples. As the first work to integrate SHAP values into adversarial attack strategies, the proposed method demonstrates robust performance even in scenarios where gradient information is limited or obscured. Experimental results show that, compared to the classical Fast Gradient Sign Method (FGSM), this technique achieves superior effectiveness and stability in inducing misclassification while maintaining high attack stealth.
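For context, the FGSM baseline that the proposed method is compared against perturbs an input along the sign of the loss gradient: x_adv = x + ε·sign(∇ₓL(x, y)). A minimal self-contained sketch against a toy binary logistic-regression classifier (the weights, inputs, and ε below are illustrative assumptions, not values from the paper):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, b, eps):
    """Fast Gradient Sign Method for binary logistic regression.

    For the cross-entropy loss, the gradient w.r.t. the input x is
    (p - y) * w, so the attack adds eps * sign((p - y) * w).
    """
    p = sigmoid(np.dot(w, x) + b)   # model confidence for class 1
    grad_x = (p - y) * w            # dL/dx for cross-entropy loss
    return x + eps * np.sign(grad_x)

# Toy example (hypothetical weights and input, for illustration only)
w = np.array([1.0, -2.0, 0.5])
b = 0.0
x = np.array([0.3, -0.1, 0.8])
y = 1.0                             # true label
x_adv = fgsm(x, y, w, b, eps=0.1)   # confidence in the true class drops
```

Note that FGSM needs the exact input gradient, which is why gradient-masking defenses can blunt it; the SHAP-based attack described above sidesteps this by ranking features via their Shapley contributions instead.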
📝 Abstract
The paper introduces a white-box attack on computer vision models using SHAP values. It demonstrates how adversarial evasion attacks can compromise the performance of deep learning models by reducing output confidence or inducing misclassifications. Such attacks are particularly insidious because they can deceive an algorithm while remaining imperceptible to the human eye. The proposed attack leverages SHAP values to quantify the significance of individual input features to the output at the inference stage. A comparison is drawn between the SHAP attack and the well-known Fast Gradient Sign Method (FGSM). We find evidence that SHAP attacks are more robust in generating misclassifications, particularly in gradient-hiding scenarios.
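The core idea of a SHAP-guided attack can be illustrated with a self-contained sketch: estimate each feature's Shapley value by Monte Carlo permutation sampling, then attenuate the features that most support the current prediction by nudging them toward a reference baseline. The toy model, weights, baseline, and step size below are illustrative assumptions, not the paper's actual setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def model(x, w=np.array([1.0, -2.0, 0.5]), b=0.0):
    """Toy classifier score in [0, 1] (hypothetical, for illustration)."""
    return 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))

def shapley_sample(x, baseline, n_samples=2000):
    """Monte Carlo estimate of per-feature Shapley values.

    For each random feature ordering, accumulate the marginal change in
    the model output when a feature flips from its baseline value to x.
    """
    d = len(x)
    phi = np.zeros(d)
    for _ in range(n_samples):
        z = baseline.astype(float).copy()
        for i in rng.permutation(d):
            before = model(z)
            z[i] = x[i]
            phi[i] += model(z) - before
    return phi / n_samples

def shap_attack(x, baseline, eps=0.1, k=1):
    """Attenuate the k features with the largest positive Shapley values
    (those supporting the prediction) by moving them toward the baseline,
    with the per-feature step capped at eps for imperceptibility."""
    phi = shapley_sample(x, baseline)
    x_adv = x.astype(float).copy()
    top = np.argsort(-phi)[:k]                          # strongest supporters
    x_adv[top] -= np.clip(x[top] - baseline[top], -eps, eps)
    return x_adv

# Toy usage: the attack lowers the model's confidence in its prediction
x = np.array([0.3, -0.1, 0.8])
baseline = np.zeros(3)
x_adv = shap_attack(x, baseline, eps=0.1, k=1)
```

Because the Shapley estimates rely only on forward passes of the model, a sketch like this keeps working even when gradients are hidden or obfuscated, which is consistent with the gradient-hiding advantage the abstract reports.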