Wasserstein distributional adversarial training for deep neural networks

📅 2025-02-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of enhancing deep neural network robustness under concurrent distributional shift and adversarial perturbations. We propose a Wasserstein-distance-based distributional adversarial training method that, for the first time, incorporates sensitivity analysis from Wasserstein distributionally robust optimization (DRO) into the adversarial training framework, extending TRADES to jointly optimize both distributional and pointwise robustness. Our approach requires no additional data, enabling efficient fine-tuning of pre-trained models using only the standard 50k CIFAR-10 training samples. Experiments on RobustBench demonstrate that our method significantly improves Wasserstein distributional robustness while preserving the pointwise robustness of baseline models. Notably, it delivers consistent performance gains even when applied to large-scale models pre-trained on millions of synthetic samples, using only small-scale real data. This establishes a practical, data-efficient pathway toward simultaneously strengthening both types of robustness without architectural or data augmentation overhead.

📝 Abstract
The design of adversarial attacks for deep neural networks, as well as methods of adversarial training against them, are the subject of intense research. In this paper, we propose methods to train against distributional attack threats, extending the TRADES method used for pointwise attacks. Our approach leverages recent contributions and relies on sensitivity analysis for Wasserstein distributionally robust optimization problems. We introduce an efficient fine-tuning method which can be deployed on a previously trained model. We test our methods on a range of pre-trained models on RobustBench. These experimental results demonstrate that the additional training enhances Wasserstein distributional robustness while maintaining the original levels of pointwise robustness, even for already very successful networks. The improvements are less marked for models pre-trained using huge synthetic datasets of 20-100M images. However, remarkably, sometimes our methods are still able to improve their performance even when trained using only the original training dataset (50k images).
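The TRADES objective that the abstract builds on trades off natural accuracy against robustness: a cross-entropy term on clean inputs plus a KL-divergence term that penalizes disagreement between predictions on clean and adversarially perturbed inputs. A minimal NumPy sketch of that baseline objective (function names and the `beta` default are illustrative, not taken from the paper's code):

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over the last axis
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def trades_loss(logits_clean, logits_adv, labels, beta=6.0):
    """TRADES-style objective: natural cross-entropy on clean inputs
    plus beta * KL(p_clean || p_adv), averaged over the batch."""
    p_clean = softmax(logits_clean)
    p_adv = softmax(logits_adv)
    n = len(labels)
    ce = -np.log(p_clean[np.arange(n), labels] + 1e-12).mean()
    kl = (p_clean * (np.log(p_clean + 1e-12)
                     - np.log(p_adv + 1e-12))).sum(axis=-1).mean()
    return ce + beta * kl
```

When the adversarial logits equal the clean logits the KL term vanishes and the loss reduces to plain cross-entropy; the distributional extension described in the paper changes how the adversarial inputs are generated, not this basic two-term structure.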
Problem

Research questions and friction points this paper is trying to address.

Enhance robustness against distributional adversarial attacks
Extend the TRADES method, designed for pointwise attacks, to distributional threats
Fine-tune pre-trained models for improved performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Wasserstein distributional adversarial training
Fine-tuning pre-trained models
Sensitivity analysis for robustness
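The "sensitivity analysis" bullet refers to first-order results for Wasserstein DRO: to first order, the strongest distributional attack perturbs each sample along its loss gradient, with a global (batch-shared) budget rather than a per-sample one. A hedged sketch of the resulting allocation rule for a W2 budget (the variable names and the exact constraint convention are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def allocate_w2_budget(grad_norms, eps):
    """Distribute a global Wasserstein-2 perturbation budget over a batch.

    Maximizing the first-order loss increase sum(t_i * g_i) subject to
    the shared constraint mean(t_i**2) <= eps**2 gives step sizes t_i
    proportional to each sample's loss-gradient norm g_i (Cauchy-Schwarz).
    Illustrative sketch only; not the paper's code.
    """
    g = np.asarray(grad_norms, dtype=float)
    # scale chosen so that mean(t**2) == eps**2 exactly
    scale = eps * np.sqrt(len(g)) / (np.linalg.norm(g) + 1e-12)
    return scale * g
```

Samples where the loss is most sensitive receive a larger share of the budget, which is the key difference from pointwise attacks such as PGD, where every sample gets the same per-sample radius.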