🤖 AI Summary
To address the strong dependency on surrogate models and the high computational overhead of black-box transferable adversarial attacks (e.g., in DiffPGD), this paper proposes a lightweight, transferable attack framework. Methodologically, it eliminates the expensive diffusion denoising process and, for the first time, incorporates the time-dependent classifier score from classifier-guided diffusion models directly into the adversarial optimization objective. A natural-distribution-aware loss function is designed and coupled with a lightweight surrogate model, explicitly modeling the data prior and decoupling the attack from any specific surrogate architecture. Experiments on ImageNet and CIFAR benchmarks show that the method improves the cross-model transfer success rate by 12.7% on average and maintains strong robustness against diffusion-based defenses such as DiffPure. Moreover, it accelerates inference by 8.3× over DiffPGD and reduces VRAM consumption by 76%.
📝 Abstract
Adversarial attacks have become a well-explored domain, frequently serving as evaluation baselines for model robustness. Among these, black-box attacks based on transferability have received significant attention due to their practical applicability in real-world scenarios. Traditional black-box methods have generally focused on improving the optimization framework (e.g., incorporating momentum in MI-FGSM) to enhance transferability, rather than examining the dependency on the surrogate white-box model architecture. The recent state-of-the-art approach DiffPGD has demonstrated enhanced transferability by employing diffusion-based adversarial purification models for adaptive attacks. The inductive bias of diffusion-based adversarial purification aligns naturally with the adversarial attack process, as both involve noise addition, which reduces dependency on the choice of surrogate white-box model. However, differentiating through the denoising process of diffusion models via the chain rule incurs substantial computational costs, manifested in excessive VRAM consumption and extended runtime. This progression prompts us to question whether introducing diffusion models is necessary at all. We hypothesize that a model sharing a similar inductive bias with diffusion-based adversarial purification, combined with an appropriate loss function, could achieve comparable or superior transferability while dramatically reducing computational overhead. In this paper, we propose a novel loss function coupled with a distinctive surrogate model to validate this hypothesis. Our approach leverages the score of the time-dependent classifier from classifier-guided diffusion models, effectively incorporating knowledge of the natural data distribution into the adversarial optimization process. Experimental results demonstrate significantly improved transferability across diverse model architectures while maintaining robustness against diffusion-based defenses.
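The core idea — augmenting a PGD-style objective with a natural-distribution-aware term so that perturbations stay consistent with the data prior — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: a toy 2-class linear classifier replaces the real surrogate model, and a fixed Gaussian score stands in for the time-dependent classifier score from classifier-guided diffusion; all function names (`pgd_with_prior`, `prior_score`) and hyperparameters are hypothetical.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def ce_input_grad(W, x, y):
    """Gradient of the cross-entropy loss w.r.t. the input, for logits = W @ x."""
    p = softmax(W @ x)
    p[y] -= 1.0
    return W.T @ p

def prior_score(x, mu, sigma2=1.0):
    """Toy Gaussian stand-in for the time-dependent classifier score:
    approximates grad_x log p(x) and pulls the iterate toward natural data."""
    return -(x - mu) / sigma2

def pgd_with_prior(W, x0, y, mu, eps=1.5, alpha=0.1, lam=0.3, steps=30):
    """L_inf PGD that ascends CE loss plus a weighted data-prior term."""
    x = x0.copy()
    for _ in range(steps):
        # combined objective: maximize CE(x, y) + lam * log p(x)
        g = ce_input_grad(W, x, y) + lam * prior_score(x, mu)
        x = x + alpha * np.sign(g)            # signed gradient-ascent step
        x = x0 + np.clip(x - x0, -eps, eps)   # project back into the eps-ball
    return x

W = np.eye(2)                 # toy 2-class linear classifier
x0 = np.array([2.0, 0.0])     # clean input, correctly classified as class 0
x_adv = pgd_with_prior(W, x0, y=0, mu=x0)
```

In the sketch, setting `lam=0` recovers plain PGD; the prior term only reweights the ascent direction so the adversarial example does not drift far off the (toy) data manifold, which is the mechanism the paper attributes its transferability and robustness to diffusion-based purification to.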