Adversarially Robust Deep Learning with Optimal-Transport-Regularized Divergences

📅 2023-09-07

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To improve the adversarial robustness of deep models, this paper proposes ARMOR_D, a distributionally robust optimization (DRO) framework regularized by an optimal-transport (OT)-based divergence. The divergence is built as the infimal convolution of an information divergence and an OT cost, so an adversary can both transport (perturb) samples and re-weight them. This goes beyond per-sample perturbation schemes and applies to both continuous and discrete data domains. On the theory side, ARMOR_D draws on OT theory, information geometry, and distributionally robust optimization. Empirically, on MNIST it achieves robust accuracies of 98.29% under FGSM and 98.18% under PGD⁴⁰, reducing error rates by more than 19.7% and 37.2%, respectively, relative to prior methods. In malware detection, under the rFGSM⁵⁰ attack, it improves robust accuracy by 37.0% over the previous best adversarial training methods, while decreasing false positive and false negative rates by 57.53% and 51.1%, respectively.
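
The MNIST figures above are relative reductions in error rate, not absolute gains in accuracy. As a rough arithmetic sketch, the snippet below back-computes the baseline error rate those figures imply. The function name `implied_baseline_error` is hypothetical, and the resulting baselines are inferred from the summary's numbers rather than reported in the source; since the abstract says "more than", they are lower bounds.

```python
def implied_baseline_error(robust_acc_pct, rel_reduction_pct):
    """Baseline error rate (in %) implied by a relative error-rate reduction.

    Assumes: new_error = baseline_error * (1 - rel_reduction).
    """
    new_error = 100.0 - robust_acc_pct
    return new_error / (1.0 - rel_reduction_pct / 100.0)


# Figures taken from the summary; the baselines are inferred, not reported.
fgsm_baseline = implied_baseline_error(98.29, 19.7)
pgd_baseline = implied_baseline_error(98.18, 37.2)

print(f"FGSM: error 1.71% vs implied baseline of about {fgsm_baseline:.2f}%")
print(f"PGD-40: error 1.82% vs implied baseline of about {pgd_baseline:.2f}%")
```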

📝 Abstract

We introduce the $ARMOR_D$ methods as novel approaches to enhancing the adversarial robustness of deep learning models. These methods are based on a new class of optimal-transport-regularized divergences, constructed via an infimal convolution between an information divergence and an optimal-transport (OT) cost. We use these as tools to enhance adversarial robustness by maximizing the expected loss over a neighborhood of distributions, a technique known as distributionally robust optimization. Viewed as a tool for constructing adversarial samples, our method allows samples to be both transported, according to the OT cost, and re-weighted, according to the information divergence. We demonstrate the effectiveness of our method on malware detection and image recognition applications and find that, to our knowledge, it outperforms existing methods at enhancing the robustness against adversarial attacks. $ARMOR_D$ yields the robustified accuracy of $98.29\%$ against $FGSM$ and $98.18\%$ against $PGD^{40}$ on the MNIST dataset, reducing the error rate by more than $19.7\%$ and $37.2\%$ respectively compared to prior methods. Similarly, in malware detection, a discrete (binary) data domain, $ARMOR_D$ improves the robustified accuracy under $rFGSM^{50}$ attack compared to the previous best-performing adversarial training methods by $37.0\%$ while lowering false negative and false positive rates by $51.1\%$ and $57.53\%$, respectively.
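
The construction described in the abstract can be written out as follows. This is a sketch in assumed notation, not copied from the paper: $D$ is the information divergence, $C$ the OT cost with pointwise cost $c$, $P_n$ the empirical training distribution, $\epsilon$ the neighborhood radius, and $\mathcal{L}_\theta$ the loss of the model with parameters $\theta$. The symbols and the order of the arguments are assumptions.

```latex
% OT-regularized divergence: infimal convolution of D and the OT cost C
D^c(Q \,\|\, P) = \inf_{\eta} \big\{ D(\eta \,\|\, P) + C(\eta, Q) \big\},
\qquad
C(\eta, Q) = \inf_{\pi \in \Pi(\eta, Q)} \int c(x, y)\, \pi(dx, dy)

% Distributionally robust training: minimize the worst-case expected loss
\inf_{\theta} \; \sup_{Q \,:\, D^c(Q \,\|\, P_n) \le \epsilon} \; \mathbb{E}_{Q}\big[ \mathcal{L}_\theta \big]
```

Read this way, the intermediate distribution $\eta$ is a re-weighted version of $P_n$ (paid for by $D$), which is then transported to $Q$ (paid for by $C$).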

Problem

Research questions and friction points this paper is trying to address.

Enhancing adversarial robustness in deep learning models

Combining optimal transport and information divergence for DRO

Improving performance against adversarial attacks in image recognition and malware detection

Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimal transport regularized divergences enhance robustness

Adversarial re-weighting of samples is combined with sample transport

Generalizes best-performing loss functions in adversarial training
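
To make the re-weighting idea concrete, here is a minimal NumPy sketch of one special case only: the information divergence is taken to be the KL divergence, the constraint is replaced by a penalty with weight `lam`, and the transport step is omitted (the per-sample losses are assumed to be already computed, for example on perturbed inputs). Under those assumptions the worst-case weights follow the standard Gibbs variational formula, $w_i \propto \exp(\ell_i / \lambda)$. The function names are hypothetical and this is not the authors' implementation.

```python
import numpy as np


def kl_worst_case_weights(losses, lam):
    """Weights maximizing E_w[loss] - lam * KL(w || uniform): w_i ∝ exp(loss_i / lam)."""
    losses = np.asarray(losses, dtype=float)
    z = (losses - losses.max()) / lam  # shift by the max for numerical stability
    w = np.exp(z)
    return w / w.sum()


def kl_penalized_objective(losses, lam):
    """Optimal value of the penalized problem: lam * log(mean(exp(loss / lam)))."""
    losses = np.asarray(losses, dtype=float)
    m = losses.max()
    return m + lam * np.log(np.mean(np.exp((losses - m) / lam)))


# Hypothetical per-sample losses for three training examples.
losses = np.array([0.1, 0.5, 2.0])
weights = kl_worst_case_weights(losses, lam=1.0)
print("weights:", weights)
print("objective:", kl_penalized_objective(losses, lam=1.0))
```

Samples with larger loss receive larger weight, and a smaller `lam` concentrates the weight more sharply on the hardest samples; as `lam` grows the weights approach uniform.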
