Practical No-box Adversarial Attacks with Training-free Hybrid Image Transformation

📅 2022-03-09
🏛️ arXiv.org
📈 Citations: 20
Influential: 0
🤖 AI Summary
This work addresses the most stringent no-box adversarial attack setting—characterized by no model access, no training data, and no knowledge of model parameters or architecture. We propose a training-free, frequency-domain hybrid image transformation method that suppresses original high-frequency components while injecting structured, regionally uniform, and densely repetitive high-frequency noise. This enables real-time cross-model attacks. We provide the first theoretical proof of the existence of training-free adversarial perturbations under the no-box setting, uncovering the critical role of high-frequency components in deep neural network (DNN) classification decisions, and establishing design principles for controllable high-frequency noise. Evaluated on ImageNet against ten mainstream models, our method achieves an average attack success rate of 98.13%, outperforming state-of-the-art no-box methods by 29.39% and matching the performance of transfer-based black-box attacks.
📝 Abstract
In recent years, the adversarial vulnerability of deep neural networks (DNNs) has attracted increasing attention. Among all threat models, no-box attacks are the most practical but extremely challenging, since they neither rely on any knowledge of the target model or a similar substitute model, nor access a dataset for training a new substitute model. Although a recent method has attempted such an attack in a loose sense, its performance is unsatisfactory and the computational overhead of training is high. In this paper, we move a step forward and show the existence of a training-free adversarial perturbation under the no-box threat model, which can successfully attack different DNNs in real time. Motivated by our observation that the high-frequency component (HFC) dominates in low-level features and plays a crucial role in classification, we attack an image mainly by manipulating its frequency components. Specifically, the perturbation is constructed by suppressing the original HFC and adding noisy HFC. We empirically and experimentally analyze the requirements of effective noisy HFC and show that it should be regionally homogeneous, repeating, and dense. Extensive experiments on the ImageNet dataset demonstrate the effectiveness of our proposed no-box method. It attacks ten well-known models with a success rate of 98.13% on average, outperforming state-of-the-art no-box attacks by 29.39%. Furthermore, our method is even competitive with mainstream transfer-based black-box attacks.
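The perturbation described above has two parts: suppress the image's own high-frequency components, then inject noisy HFC that is regionally homogeneous, repeating, and dense. A minimal sketch of that idea (not the authors' implementation — the low-pass filter, the tiled checkerboard pattern, and all parameter names here are illustrative assumptions):

```python
import numpy as np

def hybrid_image_attack(img, cutoff=0.1, eps=16.0, tile=8):
    """Illustrative frequency-domain no-box perturbation:
    1) suppress the image's own high-frequency components (low-pass filter),
    2) inject a dense, repetitive, regionally homogeneous high-frequency
       pattern (a checkerboard of tile-sized blocks alternating +/- eps).
    `img` is a float array in [0, 255], one channel, shape (H, W)."""
    h, w = img.shape
    # Step 1: keep only frequencies within `cutoff` of the spectrum center.
    F = np.fft.fftshift(np.fft.fft2(img))
    yy, xx = np.mgrid[:h, :w]
    dist = np.hypot(yy - h / 2, xx - w / 2)
    mask = dist <= cutoff * min(h, w)
    low = np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))
    # Step 2: regionally homogeneous, repeating, dense noisy HFC —
    # blocks of size `tile` alternating between +eps and -eps.
    pattern = (((yy // tile) + (xx // tile)) % 2) * 2.0 - 1.0
    return np.clip(low + eps * pattern, 0, 255)
```

Block boundaries of the checkerboard introduce sharp, periodic edges, i.e. exactly the kind of structured high-frequency energy the paper argues is effective, while the low-pass step removes the HFC the classifier would otherwise rely on.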
Problem

Research questions and friction points this paper is trying to address.

How can adversarial examples be crafted with no access to the target model, its parameters, its architecture, or its training data?
Existing no-box attacks require expensive substitute-model training and achieve unsatisfactory success rates.
Which frequency components of an image drive DNN classification decisions, and can manipulating them alone yield effective attacks?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-free adversarial perturbation technique that works in real time under the no-box threat model
Suppresses an image's original high-frequency components and injects regionally homogeneous, repeating, dense noisy HFC
Achieves a 98.13% average attack success rate across ten ImageNet models, competitive with transfer-based black-box attacks