🤖 AI Summary
This work addresses catastrophic overfitting (CO) and performance degradation in $l_0$-sparse adversarial training when single-step (fast) attacks are used. Two root causes are identified: (i) single-step attacks under $l_0$ constraints struggle to locate optimal perturbation positions, and (ii) the $l_0$ adversarial loss landscape oscillates severely, being more rugged than its $l_\infty$, $l_2$, and $l_1$ counterparts. To mitigate this, the authors propose a loss-landscape smoothing paradigm that combines soft-label distillation with a trade-off (weighted) cross-entropy loss, integrated with efficient $l_0$-specific single-step attacks (e.g., PerlinAttack or Random+PGD variants) into a robust single-step training framework, Fast-LS-$l_0$. The method eliminates CO, substantially narrows the robustness gap between single-step and multi-step training, and achieves state-of-the-art $l_0$-sparse robustness on CIFAR-10 and CIFAR-100.
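To make the smoothing idea concrete, below is a minimal PyTorch sketch of a training objective that pairs soft labels with a TRADES-style trade-off term. The function name, the label-smoothing scheme, and the hyper-parameters `smoothing` and `beta` are illustrative assumptions, not the paper's exact formulation or reported values.

```python
import torch
import torch.nn.functional as F

def smoothed_tradeoff_loss(model, x_clean, x_adv, y, num_classes,
                           smoothing=0.1, beta=6.0):
    """Hypothetical sketch: soft labels plus a trade-off (TRADES-style) term.

    The exact weighting used by Fast-LS-l0 may differ; this only shows the
    two ingredients named in the summary.
    """
    # Soft labels: mix the one-hot target with a uniform distribution.
    one_hot = F.one_hot(y, num_classes).float()
    soft_y = (1.0 - smoothing) * one_hot + smoothing / num_classes

    logits_clean = model(x_clean)
    logits_adv = model(x_adv)

    # Clean term: cross-entropy against the soft labels.
    clean_loss = -(soft_y * F.log_softmax(logits_clean, dim=1)).sum(1).mean()

    # Trade-off term: pull adversarial predictions toward the clean ones,
    # which flattens the adversarial loss surface around each example.
    robust_loss = F.kl_div(F.log_softmax(logits_adv, dim=1),
                           F.softmax(logits_clean, dim=1),
                           reduction="batchmean")
    return clean_loss + beta * robust_loss
```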
📝 Abstract
This paper studies fast adversarial training against sparse adversarial perturbations bounded by the $l_0$ norm. We demonstrate the challenges of employing $1$-step attacks on $l_0$-bounded perturbations for fast adversarial training, including degraded performance and the occurrence of catastrophic overfitting (CO). We highlight that CO in $l_0$ adversarial training is caused by the sub-optimal perturbation locations chosen by the $1$-step attack. Theoretical and empirical analyses reveal that the loss landscape of $l_0$ adversarial training is more craggy than its $l_\infty$, $l_2$, and $l_1$ counterparts, and we corroborate that this craggy loss landscape aggravates CO. To address these issues, we propose Fast-LS-$l_0$, which incorporates soft labels and the trade-off loss function to smooth the adversarial loss landscape. Extensive experiments demonstrate that our method overcomes catastrophic overfitting, achieves state-of-the-art performance, and narrows the performance gap between $1$-step and multi-step adversarial training against sparse attacks.
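For intuition about why a $1$-step attack can pick poor perturbation locations, here is a generic single-step $l_0$-bounded attack sketch (not the paper's attack): one gradient step whose update is kept only at the $k$ coordinates with the largest gradient magnitude, so the committed positions are fixed after a single backward pass. Sparsity is counted per coordinate here for simplicity; in practice it is often counted per pixel across channels.

```python
import torch
import torch.nn.functional as F

def one_step_l0_attack(model, x, y, k=20, step_size=1.0):
    """Generic sketch of a single-step l0-bounded attack: one gradient step,
    restricted to the top-k coordinates by gradient magnitude, so that
    ||x_adv - x||_0 <= k. Illustrative only."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]

    # Choose perturbation locations: per-sample top-k entries of |grad|.
    flat = grad.abs().flatten(1)
    topk_idx = flat.topk(k, dim=1).indices
    mask = torch.zeros_like(flat).scatter_(1, topk_idx, 1.0).view_as(x)

    # One signed step applied only at the selected coordinates.
    x_adv = (x + step_size * grad.sign() * mask).clamp(0.0, 1.0)
    return x_adv.detach()
```

Because the locations are chosen from a single gradient evaluation, they can be far from the positions a multi-step sparse attack would converge to, which is the sub-optimality the paper identifies as a trigger of CO.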