🤖 AI Summary
Evaluating system robustness under distributional shift remains challenging because worst-case perturbations are hard to characterize. Method: This paper proposes a continuous distributionally robust optimization (DRO) framework grounded in Wasserstein geometry. Leveraging Brenier's theorem, it models the worst-case distribution as the pushforward of a continuous reference measure under a differentiable transport map, bypassing the sampling and support-set constraints inherent in discrete DRO. A neural network parameterizes the transport map, enabling simulation-free, end-to-end optimization of both the decision model and the map via gradient descent-ascent (GDA). Results: Experiments on synthetic and image datasets demonstrate that the method efficiently generates risk-controlled worst-case distributions with strong convergence, generalization, and scalability, establishing a new paradigm for robustness evaluation and stress testing under distributional uncertainty.
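The pushforward construction in the summary can be sketched in a few lines: draw samples from a continuous reference measure and apply a transport map to obtain worst-case samples. This is a hypothetical illustration, not the paper's code; the paper learns a neural transport map, whereas here a fixed affine map stands in for it, and all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def transport_map(z, W, b):
    """Affine stand-in for the learned transport map T (the paper uses a neural net)."""
    return z @ W.T + b

# Reference samples z ~ rho = N(0, I), pushed forward to worst-case samples x = T(z),
# i.e. samples from the pushforward measure T#rho.
z = rng.standard_normal((1000, 2))
W = np.array([[1.5, 0.0], [0.0, 0.5]])  # illustrative map parameters
b = np.array([1.0, -1.0])
x = transport_map(z, W, b)
```

Because sampling from the pushforward only requires evaluating T on reference draws, no discrete support set or inner sampling loop is needed, which is the scalability advantage the summary highlights.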
📝 Abstract
Worst-case generation plays a critical role in evaluating robustness and stress-testing systems under distribution shifts, in applications ranging from machine learning models to power grids and medical prediction systems. We develop a generative modeling framework for worst-case generation for a pre-specified risk, based on min-max optimization over continuous probability distributions in the Wasserstein space. Unlike traditional discrete distributionally robust optimization approaches, which often suffer from scalability issues, limited generalization, and costly worst-case inference, our framework exploits Brenier's theorem to characterize the least favorable (worst-case) distribution as the pushforward of a transport map from a continuous reference measure, enabling a continuous and expressive notion of risk-induced generation beyond classical discrete DRO formulations. Based on the min-max formulation, we propose a Gradient Descent Ascent (GDA)-type scheme that updates the decision model and the transport map in a single loop, establishing global convergence guarantees under mild regularity assumptions, even without convexity-concavity. We also propose to parameterize the transport map using a neural network that can be trained simultaneously with the GDA iterations by matching the transported training samples, thereby achieving a simulation-free approach. The efficiency of the proposed method as a risk-induced worst-case generator is validated by numerical experiments on synthetic and image data.
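The single-loop GDA scheme described in the abstract can be illustrated on a toy min-max problem. This is a deliberately simplified sketch under assumed settings, not the paper's algorithm: the decision model is a scalar linear predictor `theta`, the transport map is reduced to a scalar input shift `phi`, and `lam` plays the role of a Wasserstein-type transport-cost penalty; all names and values are hypothetical.

```python
import numpy as np

x = np.array([-1.0, 0.0, 1.0, 2.0])  # toy inputs
y = x.copy()                         # targets (identity ground truth)
theta, phi = 0.0, 0.0                # decision parameter / transport shift
eta, lam = 0.05, 5.0                 # shared step size, transport-cost penalty

# Single-loop GDA: simultaneous descent on theta (the decision model) and
# ascent on phi (the adversarial transport), as in the paper's scheme.
for _ in range(2000):
    r = theta * (x + phi) - y                        # residuals on shifted data
    g_theta = 2 * np.mean(r * (x + phi))             # gradient for the min player
    g_phi = 2 * theta * np.mean(r) - 2 * lam * phi   # gradient for the max player
    theta, phi = theta - eta * g_theta, phi + eta * g_phi
```

With a sufficiently large penalty `lam`, the objective is strongly concave in `phi`, and this simultaneous update converges to the saddle point (here `theta = 1`, `phi = 0`); the paper's contribution is establishing such convergence in the far more general Wasserstein-space setting.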