🤖 AI Summary
This work addresses the generation of sparse adversarial perturbations, both unstructured and structured, along with robustness evaluation against them and defenses built on them. The authors propose Sparse-PGD, a white-box attack in the PGD paradigm that efficiently and effectively generates such perturbations, and pair it with a complementary black-box attack so that robustness against sparse perturbations can be evaluated more comprehensively and reliably. Sparse-PGD's efficiency also makes end-to-end sparse adversarial training practical. Extensive experiments across multiple benchmark models and attack settings show that the attack performs strongly in diverse scenarios, and that the adversarially trained model achieves state-of-the-art robust accuracy against a range of sparse attacks.
📝 Abstract
This work studies sparse adversarial perturbations, including both unstructured and structured ones. We propose a framework based on a white-box PGD-like attack, named Sparse-PGD, to generate such perturbations effectively and efficiently. Furthermore, we combine Sparse-PGD with a black-box attack to evaluate models' robustness against unstructured and structured sparse adversarial perturbations more comprehensively and reliably. Moreover, the efficiency of Sparse-PGD enables us to conduct adversarial training to build models robust to various sparse perturbations. Extensive experiments demonstrate that our proposed attack algorithm exhibits strong performance in different scenarios. More importantly, compared with other robust models, our adversarially trained model demonstrates state-of-the-art robustness against various sparse attacks.
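To make the idea of a PGD-like attack with a sparsity constraint concrete, here is a minimal, hypothetical sketch of one iteration: a signed-gradient ascent step on the perturbation, followed by a projection that keeps only the `k` largest-magnitude entries (an l0-ball projection) and clips each entry's magnitude. The function name, parameters, and projection details are illustrative assumptions, not the paper's actual Sparse-PGD algorithm.

```python
import numpy as np

def sparse_pgd_step(grad, delta, alpha=0.01, k=5, eps=0.25):
    """One illustrative sparse-PGD-style iteration (assumed, simplified).

    grad  : loss gradient w.r.t. the input (same shape as delta)
    delta : current perturbation
    alpha : step size; k : sparsity budget; eps : per-entry magnitude bound
    """
    delta = delta + alpha * np.sign(grad)            # PGD-style signed-gradient step
    flat = np.abs(delta).ravel()
    keep = np.argpartition(flat, -k)[-k:]            # indices of the k largest magnitudes
    mask = np.zeros(delta.size, dtype=bool)
    mask[keep] = True
    delta = np.where(mask.reshape(delta.shape), delta, 0.0)  # l0 projection: zero the rest
    return np.clip(delta, -eps, eps)                 # bound each entry's magnitude

# Toy usage on a random image-shaped gradient.
rng = np.random.default_rng(0)
grad = rng.standard_normal((8, 8))
delta = sparse_pgd_step(grad, np.zeros((8, 8)), k=5)
print(int(np.count_nonzero(delta)))  # → 5 (at most k entries are nonzero)
```

In a full attack this step would be repeated, recomputing the gradient after each projection; structured variants would additionally constrain which entries the mask may select (e.g. contiguous pixel groups).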