🤖 AI Summary
Vision Transformers (ViTs) exhibit severely degraded adversarial robustness under prompt tuning, a lightweight adaptation paradigm. Method: This paper proposes ADAPT, a lightweight adaptive adversarial training framework. It identifies gradient obfuscation in prompt tuning as the key mechanism undermining conventional adversarial training, and accordingly designs task-aware adaptive perturbation generation, learnable prompt embeddings, and gradient reweighting, enabling efficient PGD-based construction of adversarial examples directly in the prompt-tuning setting. Contribution/Results: ADAPT optimizes only ~1% of model parameters, a 99% reduction in updates relative to full-model fine-tuning, yet achieves ~40% robust accuracy under standard adversarial attacks, competitive with state-of-the-art full fine-tuning defenses at minimal parameter overhead.
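To make the setup concrete, below is a minimal, hypothetical PyTorch sketch of adversarial training in the prompt-tuning regime: a frozen toy encoder stands in for the ViT backbone, only the prompt embeddings and classifier head receive gradient updates, and standard L-inf PGD generates the adversarial examples. All names (`PromptedEncoder`, `pgd`, the hyperparameters) are illustrative assumptions, not the paper's ADAPT implementation.

```python
import torch
import torch.nn as nn

class PromptedEncoder(nn.Module):
    """Toy stand-in for a prompted ViT: a frozen patch embedding and
    transformer layer, plus trainable prompt tokens and classifier head."""
    def __init__(self, patch=8, dim=64, n_prompts=8, n_classes=10):
        super().__init__()
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.encoder = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        for p in list(self.patch_embed.parameters()) + list(self.encoder.parameters()):
            p.requires_grad_(False)                        # backbone stays frozen
        self.prompts = nn.Parameter(0.02 * torch.randn(n_prompts, dim))  # trainable
        self.head = nn.Linear(dim, n_classes)              # trainable

    def forward(self, x):
        tokens = self.patch_embed(x).flatten(2).transpose(1, 2)      # (B, N, D)
        prompts = self.prompts.unsqueeze(0).expand(x.size(0), -1, -1)
        z = self.encoder(torch.cat([prompts, tokens], dim=1))        # prepend prompts
        return self.head(z.mean(dim=1))

def pgd(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Standard L-inf PGD: ascend the loss, projecting back onto the eps-ball."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = nn.functional.cross_entropy(model(x_adv), y)
        (grad,) = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)           # stay within eps of x
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

model = PromptedEncoder()
trainable = [p for p in model.parameters() if p.requires_grad]  # prompts + head only
opt = torch.optim.Adam(trainable, lr=1e-3)
x, y = torch.rand(4, 3, 32, 32), torch.randint(0, 10, (4,))
opt.zero_grad()
loss = nn.functional.cross_entropy(model(pgd(model, x, y)), y)  # train on adv. examples
loss.backward()
opt.step()
```

Note that the PGD gradient flows through the frozen backbone, so the attack stays fully differentiable end to end; this is exactly what an adaptive attack exploits when a defense relies on obfuscated gradients.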
📝 Abstract
The performance of deep models, including Vision Transformers, is known to be vulnerable to adversarial attacks. Many existing defenses against these attacks, such as adversarial training, rely on full-model fine-tuning to induce robustness in the models. These defenses require storing a copy of the entire model, which can have billions of parameters, for each task. At the same time, parameter-efficient prompt tuning is used to adapt large transformer-based models to downstream tasks without the need to store such large copies. In this paper, we examine parameter-efficient prompt tuning of Vision Transformers for downstream tasks through the lens of robustness. We show that previous adversarial defense methods, when applied to the prompt tuning paradigm, suffer from gradient obfuscation and are vulnerable to adaptive attacks. We introduce ADAPT, a novel framework for performing adaptive adversarial training in the prompt tuning paradigm. Our method achieves competitive robust accuracy of ~40% w.r.t. SOTA robustness methods that use full-model fine-tuning, while tuning only ~1% of the parameters.
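For intuition on the "~1% of parameters" figure, the following back-of-the-envelope sketch counts the tunable parameters of a deep-prompted ViT-B/16-scale model; the backbone size, prompt length, and class count are illustrative assumptions rather than the paper's exact configuration.

```python
# Back-of-the-envelope check of the "~1% of parameters" claim.
# All numbers are illustrative assumptions (ViT-B/16-scale backbone,
# prompts inserted at every layer); the paper's exact setup may differ.
backbone = 86_000_000                              # approx. ViT-B/16 parameter count
dim, layers, prompts_per_layer, classes = 768, 12, 100, 100
prompt_params = layers * prompts_per_layer * dim   # 921,600 prompt parameters
head_params = dim * classes + classes              # 76,900 classifier parameters
tuned = prompt_params + head_params
print(f"tuned: {tuned:,} of {backbone:,} ({tuned / backbone:.2%})")  # approx. 1.16%
```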