🤖 AI Summary
Fast adversarial attacks like FGSM are widely considered unsuitable for robust training, largely because of their tendency toward catastrophic overfitting early in training. Method: This work systematically investigates whether FGSM-based adversarial fine-tuning can serve as a low-cost, scalable alternative to expensive PGD-based fine-tuning in robust transfer learning, with a focus on parameter-efficient settings such as linear probing. Contribution/Results: Contrary to conventional wisdom, we demonstrate that FGSM fine-tuning is remarkably stable and avoids catastrophic overfitting at standard perturbation budgets (ε = 4 and ε = 8). Empirical evaluation across multiple benchmarks shows that FGSM fine-tuning incurs only marginal robust accuracy drops relative to PGD (0.39% at ε = 4 and 1.39% at ε = 8) while reducing training time by 75%. To our knowledge, this is the first study to establish FGSM's stability and scalability in adversarial fine-tuning, pointing toward a lightweight paradigm for robust transfer learning.
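For context, the cost gap comes from the structure of the two attacks: FGSM takes a single signed gradient step, while PGD iterates. The standard definitions (notation assumed here, not spelled out in the summary above) are:

$$x_{\text{adv}} = x + \varepsilon \cdot \operatorname{sign}\big(\nabla_x \mathcal{L}(f_\theta(x), y)\big) \quad \text{(FGSM)}$$

$$x^{(k+1)} = \Pi_{\varepsilon}\Big(x^{(k)} + \alpha \cdot \operatorname{sign}\big(\nabla_x \mathcal{L}(f_\theta(x^{(k)}), y)\big)\Big), \quad k = 0, \dots, K-1 \quad \text{(PGD)}$$

where $f_\theta$ is the model, $\mathcal{L}$ the training loss, $\Pi_{\varepsilon}$ the projection onto the $\ell_\infty$ ball of radius $\varepsilon$ around $x$, and $\alpha$, $K$ the PGD step size and iteration count. Each PGD step costs one extra forward/backward pass, so $K$-step PGD is roughly $K\times$ as expensive per batch as FGSM, which is consistent with the reported 75% reduction in training time.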
📝 Abstract
Transfer learning is often used to decrease the computational cost of model training, since fine-tuning lets a downstream task leverage the features learned on the pre-training dataset and quickly adapt them to a new task. This is particularly useful for achieving adversarial robustness, as adversarially training models from scratch is very computationally expensive. However, high robustness in transfer learning still requires adversarial training during the fine-tuning phase, which takes up to an order of magnitude more time than standard fine-tuning. In this work, we revisit the use of the fast gradient sign method (FGSM) in robust transfer learning to reduce the computational cost of adversarial fine-tuning. Surprisingly, we find that FGSM is much more stable in adversarial fine-tuning than when training from scratch. In particular, FGSM fine-tuning does not suffer from catastrophic overfitting at standard perturbation budgets of $\varepsilon=4$ or $\varepsilon=8$. This stability is further enhanced by parameter-efficient fine-tuning methods: with linear probing, FGSM remains stable even up to $\varepsilon=32$. We demonstrate how this stability translates into performance across multiple datasets. Compared to fine-tuning with the more commonly used projected gradient descent (PGD), FGSM loses only 0.39% and 1.39% test robustness on average for $\varepsilon=4$ and $\varepsilon=8$, respectively, while using $4\times$ less training time. FGSM may thus be not only a significantly more efficient alternative to PGD for adversarially robust transfer learning but also a well-performing one.
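To make the setting concrete, a minimal sketch of FGSM adversarial fine-tuning with linear probing might look like the following. This is our own reconstruction under common conventions (a ResNet-50 backbone, pixel-space budget ε = 8/255, SGD on the head only, random placeholder data); none of these specifics come from the abstract.

```python
# Minimal sketch of FGSM adversarial fine-tuning with linear probing.
# Illustrative only: model choice, epsilon scale, and training details
# are assumptions, not taken from the paper.
import torch
import torch.nn.functional as F
import torchvision

def fgsm_attack(model, x, y, eps):
    """Single-step FGSM: perturb x by eps * sign of the input gradient."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    return (x_adv + eps * grad.sign()).clamp(0.0, 1.0).detach()

num_classes = 10  # hypothetical downstream task

# Pre-trained backbone; freeze everything, then attach a fresh linear head.
model = torchvision.models.resnet50(weights="IMAGENET1K_V2")
for p in model.parameters():
    p.requires_grad = False
model.fc = torch.nn.Linear(model.fc.in_features, num_classes)  # linear probe

opt = torch.optim.SGD(model.fc.parameters(), lr=0.01, momentum=0.9)
eps = 8 / 255  # budgets quoted as "eps = 8" conventionally mean 8/255 in pixel space

# Dummy data so the sketch runs end to end; replace with the real dataset.
data = torch.utils.data.TensorDataset(
    torch.rand(64, 3, 224, 224), torch.randint(0, num_classes, (64,))
)
loader = torch.utils.data.DataLoader(data, batch_size=32)

for x, y in loader:
    x_adv = fgsm_attack(model, x, y, eps)    # one extra forward/backward per batch
    loss = F.cross_entropy(model(x_adv), y)  # train the head on adversarial inputs
    opt.zero_grad()
    loss.backward()                          # gradients flow only into model.fc
    opt.step()
```

Because the attack adds only a single extra forward/backward pass per batch, this loop stays within a small constant factor of the cost of standard linear probing, whereas a $K$-step PGD attack in its place would multiply the attack cost by $K$.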