🤖 AI Summary
This work addresses the high computational cost of adversarial training for large-scale Vision Transformers. To this end, it proposes Criticality-Aware Adversarial Training (CAAT), which, for the first time, integrates parameter criticality assessment with parameter-efficient fine-tuning (PEFT). CAAT adaptively identifies the parameters most critical to robustness and fine-tunes only about 6% of the model's parameters, substantially reducing computational overhead. Evaluated on three mainstream benchmarks, CAAT remains competitive with standard adversarial training, losing only 4.3% in adversarial robustness, while outperforming existing lightweight adversarial training methods. This approach enables scalable and efficient training of robust vision models without a large loss in adversarial robustness.
📝 Abstract
Vision Transformer (ViT) models have achieved remarkable performance across various vision tasks, with scalability being a key advantage when applied to large datasets. This scalability enables ViT models to exhibit strong generalization capabilities. However, as the number of parameters increases, the robustness of ViT models to adversarial examples does not scale proportionally. Adversarial training (AT), one of the most effective methods for enhancing robustness, typically requires fine-tuning the entire model, leading to prohibitively high computational costs, especially for large ViT architectures. In this paper, we aim to robustly fine-tune only a small subset of parameters to achieve robustness comparable to standard AT. To accomplish this, we introduce Criticality-Aware Adversarial Training (CAAT), a novel method that adaptively allocates resources to the most robustness-critical parameters, fine-tuning only selected modules. Specifically, CAAT efficiently identifies parameters that contribute most to adversarial robustness. It then leverages parameter-efficient fine-tuning (PEFT) to robustly adjust weight matrices where the number of critical parameters exceeds a predefined threshold. CAAT exhibits favorable generalization when scaled to larger vision transformer architectures, potentially paving the way for adversarial training at scale. For example, compared with plain adversarial training, CAAT incurs only a 4.3% decrease in adversarial robustness while tuning approximately 6% of the model's parameters. Extensive experiments on three widely used adversarial learning datasets demonstrate that CAAT outperforms state-of-the-art lightweight AT methods with fewer trainable parameters.
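The module-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how criticality is scored, so the sketch takes precomputed per-parameter scores as input, and the function name `select_modules`, the arguments `top_fraction` and `min_critical`, and the module names are all hypothetical. It marks the globally highest-scoring parameters as critical, then keeps only the weight matrices whose count of critical parameters exceeds a threshold; those are the matrices a PEFT method would then adapt.

```python
from typing import Dict, List

Matrix = List[List[float]]


def select_modules(
    scores: Dict[str, Matrix], top_fraction: float, min_critical: int
) -> List[str]:
    """Return names of weight matrices with more than `min_critical` critical parameters.

    `scores` maps a (hypothetical) module name to per-parameter criticality
    scores; how those scores are computed is an assumption left to the caller.
    """
    # Rank every parameter score in the model to find a global cutoff.
    flat = sorted(
        (s for matrix in scores.values() for row in matrix for s in row),
        reverse=True,
    )
    k = max(1, int(len(flat) * top_fraction))
    cutoff = flat[k - 1]

    selected = []
    for name, matrix in scores.items():
        # Count the parameters in this matrix that reach the global cutoff.
        critical = sum(1 for row in matrix for s in row if s >= cutoff)
        if critical > min_critical:
            selected.append(name)
    return selected


# Toy scores for three made-up weight matrices (12 parameters in total).
example_scores = {
    "attn.qkv": [[0.9, 0.8], [0.7, 0.1]],
    "mlp.fc1": [[0.2, 0.1], [0.75, 0.05]],
    "mlp.fc2": [[0.01, 0.02], [0.03, 0.04]],
}

# Top 25% of 12 parameters = 3 critical parameters (0.9, 0.8, 0.75).
print(select_modules(example_scores, top_fraction=0.25, min_critical=1))
# prints ['attn.qkv']
```

In this toy case `mlp.fc1` holds one critical parameter, which does not exceed the threshold of 1, so only `attn.qkv` would receive an adapter. Raising `top_fraction` or lowering `min_critical` selects more modules and therefore more trainable parameters.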