🤖 AI Summary
This work investigates the finite-sample statistical performance of distributionally robust optimization (DRO) based on optimal transport (OT) and its $f$-divergence-regularized variants, with a focus on applications in adversarial training. By establishing concentration inequalities that apply to general OT cost functions, the study provides the first non-asymptotic guarantees for DRO under soft-constraint norm-ball OT neighborhoods. In the $p$-Wasserstein setting, it achieves improved dependence on the neighborhood size compared with prior results. The analyzed approach combines adversarial example generation with an adversarial reweighting mechanism, thereby not only extending the theoretical foundations of OT-DRO but also enhancing model robustness and empirical performance against adversarial perturbations.
📝 Abstract
We study finite-sample statistical performance guarantees for distributionally robust optimization (DRO) with optimal transport (OT) and OT-regularized divergence model neighborhoods. Specifically, we derive concentration inequalities for supervised learning via DRO-based adversarial training, as commonly employed to enhance the adversarial robustness of machine learning models. Our results apply to a wide range of OT cost functions, beyond the $p$-Wasserstein case studied by previous authors. In particular, our results are the first to: 1) cover soft-constraint norm-ball OT cost functions, which have been shown empirically to enhance robustness when used in adversarial training; and 2) apply to the combination of adversarial sample generation and adversarial reweighting induced by OT-regularized $f$-divergence model neighborhoods, a reweighting mechanism that has also been shown empirically to further improve performance. In addition, even in the $p$-Wasserstein case, our bounds exhibit better dependence on the DRO neighborhood size than previous results when applied to the adversarial setting.
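To make the two mechanisms in the abstract concrete, here is a minimal, hypothetical sketch of one step of DRO-style adversarial training for a linear model: an FGSM-style perturbation of the inputs (standing in for adversarial sample generation under a norm-ball OT cost) followed by a KL-divergence reweighting of the per-sample losses (standing in for the $f$-divergence reweighting). None of the function names, the loss choice, or the parameters `eps` and `eta` come from the paper; they are illustrative assumptions only.

```python
import numpy as np

def adversarial_examples(X, y, w, eps):
    """One FGSM-style step on squared loss for a linear model x -> x @ w.
    Hypothetical stand-in for adversarial sample generation under a
    norm-ball OT cost; eps is an illustrative perturbation budget."""
    residual = X @ w - y                       # per-sample prediction error
    grad_x = residual[:, None] * w[None, :]    # d(loss_i)/d(x_i) for squared loss
    return X + eps * np.sign(grad_x)           # sign step stays in an l_inf ball

def kl_reweight(losses, eta):
    """Adversarial reweighting: for a KL-regularized neighborhood, the
    worst-case weights are an exponential tilt of the empirical distribution,
    up-weighting high-loss samples; eta is an illustrative temperature."""
    z = np.exp((losses - losses.max()) / eta)  # subtract max for numerical stability
    return z / z.sum()

# Toy data: noisy linear responses for a fixed weight vector.
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 3))
w = np.array([1.0, -0.5, 0.25])
y = X @ w + 0.1 * rng.normal(size=32)

X_adv = adversarial_examples(X, y, w, eps=0.1)
losses = 0.5 * (X_adv @ w - y) ** 2
weights = kl_reweight(losses, eta=0.5)
robust_loss = weights @ losses                 # reweighted adversarial risk
```

Because the exponential tilt puts more mass on high-loss samples, `robust_loss` is never below the plain average of the adversarial losses; a training loop would take a gradient step on this reweighted objective rather than the empirical mean.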