🤖 AI Summary
This work addresses the lack of robustness to distributional shift in multi-objective optimization: existing formulations do not explicitly model each objective's worst-case performance. The paper introduces a distributionally robust multi-objective optimization (DR-MOO) framework that minimizes each objective's loss under its own worst-case distribution. It adopts the ε-Pareto-stationary point as the solution concept and, using a Lagrangian dual reformulation, first develops a double-loop multi-gradient descent algorithm (MGDA) with sample complexity 𝒪(ε⁻¹²), then a single-loop, double-clipped MGDA in which gradient clipping handles generalized-smooth objectives and biased gradient estimates. The single-loop method reaches an ε-Pareto-stationary point with sample complexity 𝒪(ε⁻⁴), a substantial improvement over the double-loop variant. The analysis covers the nonconvex setting without assuming bounded objectives or gradients, and experiments show performance competitive with state-of-the-art MGDA baselines.
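One generic way to write a problem of this kind is sketched below. The notation is assumed here for illustration and is not taken from the paper: $M$ objectives with losses $\ell_i$, nominal data distributions $P_i$, uncertainty sets $\mathcal{U}_i(P_i)$ around them, and the probability simplex $\Delta_M$. The paper's actual uncertainty sets (for example, a divergence ball or a divergence penalty) may differ.

$$
\min_{x \in \mathbb{R}^d} \; \bigl(F_1(x), \dots, F_M(x)\bigr),
\qquad
F_i(x) = \sup_{Q \in \mathcal{U}_i(P_i)} \mathbb{E}_{\xi \sim Q}\bigl[\ell_i(x;\xi)\bigr],
\quad i = 1, \dots, M.
$$

Under the standard notion of Pareto stationarity (again an assumption about the paper's precise definition), a point $x$ is ε-Pareto stationary when some convex combination of the objective gradients is small:

$$
\min_{w \in \Delta_M} \Bigl\| \sum_{i=1}^{M} w_i \nabla F_i(x) \Bigr\| \le \epsilon.
$$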
📝 Abstract
Multi-objective optimization (MOO) has received growing attention in applications that require learning under multiple criteria. However, existing MOO formulations do not explicitly account for distributional shifts in the data. We introduce distributionally robust multi-objective optimization (DR-MOO), which minimizes multiple objectives under their respective worst-case distributions. We propose Pareto-type solution concepts for DR-MOO and develop multi-gradient descent algorithms (MGDA) with provable guarantees. Leveraging a Lagrangian dual reformulation, we first design a double-loop MGDA that uses an inner loop to estimate dual variables and achieves a total sample complexity of $\mathcal{O}(\epsilon^{-12})$ for reaching an $\epsilon$-Pareto-stationary point. To further improve efficiency, we incorporate gradient clipping to handle generalized-smooth objectives and biased gradient estimates, removing the need for double sampling. This yields a single-loop double-clip MGDA with a substantially improved sample complexity of $\mathcal{O}(\epsilon^{-4})$. Our theory applies to the nonconvex setting and does not require bounded objectives or gradients. Experiments demonstrate that our methods are competitive with state-of-the-art MGDA baselines.
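As a rough illustration of combining an MGDA update with gradient clipping, the sketch below runs a two-objective min-norm step on toy quadratics. It is not the paper's algorithm. The objectives, the exact (non-stochastic) gradients, the names `clip`, `min_norm_weights`, `clipped_mgda_step`, and `run_demo`, and the thresholds `tau_each` and `tau_dir` are all assumptions made for this example. Reading "double clip" as clipping each per-objective gradient and then the combined direction is likewise only one possible interpretation, and the DRO dual variables are omitted entirely.

```python
import numpy as np

# Hypothetical toy problem: f1(x) = 0.5*||x - A||^2, f2(x) = 0.5*||x - B||^2.
# Its Pareto set is the line segment between A and B.
A = np.array([1.0, 0.0])
B = np.array([-1.0, 0.0])


def clip(g, tau):
    """Rescale g so that its Euclidean norm is at most tau."""
    norm = np.linalg.norm(g)
    return g if norm <= tau else g * (tau / norm)


def min_norm_weights(g1, g2):
    """Closed-form two-objective MGDA weight.

    Returns w in [0, 1] minimizing ||w*g1 + (1 - w)*g2||^2.
    """
    diff = g1 - g2
    denom = diff @ diff
    if denom == 0.0:
        return 0.5  # identical gradients: any weight gives the same direction
    w = ((g2 - g1) @ g2) / denom
    return float(np.clip(w, 0.0, 1.0))


def clipped_mgda_step(x, lr, tau_each, tau_dir):
    """One update: clip each gradient, combine by min-norm, clip the result.

    This two-stage clipping is an assumed reading of "double clip".
    """
    g1 = clip(x - A, tau_each)  # gradient of f1, clipped
    g2 = clip(x - B, tau_each)  # gradient of f2, clipped
    w = min_norm_weights(g1, g2)
    d = clip(w * g1 + (1.0 - w) * g2, tau_dir)
    return x - lr * d


def run_demo(steps=500, lr=0.1, tau_each=1.0, tau_dir=1.0):
    """Iterate from a fixed start and return the final point."""
    x = np.array([0.3, 3.0])
    for _ in range(steps):
        x = clipped_mgda_step(x, lr, tau_each, tau_dir)
    return x


if __name__ == "__main__":
    x = run_demo()
    g1, g2 = x - A, x - B
    w = min_norm_weights(g1, g2)
    # Norm of the min-norm convex combination of the unclipped gradients:
    # a value near zero indicates approximate Pareto stationarity.
    print("final x:", x)
    print("stationarity measure:", np.linalg.norm(w * g1 + (1.0 - w) * g2))
```

Clipping only rescales each gradient by a positive factor, so the min-norm combination of the clipped gradients still gives a common descent direction for both toy objectives. The iterate therefore moves toward the segment between `A` and `B`, where the stationarity measure vanishes.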