🤖 AI Summary
This paper addresses ambiguity sets in distributionally robust optimization (DRO) that admit unrealistic distributions and therefore make the resulting decisions overly conservative. To tackle this, we propose a novel ambiguity set construction that integrates structural causal models (SCMs). The key contribution is to incorporate structural equation information, not just the causal graph, into the ambiguity set via structural causal optimal transport (SCOT) and its regularized relaxation, which explicitly encodes the causal mechanisms among variables. Prior approaches such as adapted and $\mathcal{G}$-causal optimal transport use only causal graph information. Methodologically, we combine optimal transport theory with difference-of-convex programming: the relaxation replaces the complex causal constraints with a regularization term, which enables an efficient algorithm. We provide finite-sample guarantees for the case where structural information is absent and must be estimated, and we show that the approach overcomes the curse of dimensionality in optimal transport, with the ambiguity set radius shrinking at a dimension-free order.
📝 Abstract
Distributionally robust optimization tackles out-of-sample issues like overfitting and distribution shifts by adopting an adversarial approach over a range of possible data distributions, known as the ambiguity set. To balance conservatism and accuracy, these sets must include realistic probability distributions by leveraging information from the nominal distribution. Assuming that nominal distributions arise from a structural causal model with a directed acyclic graph $\mathcal{G}$ and structural equations, previous methods such as adapted and $\mathcal{G}$-causal optimal transport have only utilized causal graph information in designing ambiguity sets. In this work, we propose incorporating structural equations, which include causal graph information, to enhance ambiguity sets, resulting in more realistic distributions. We introduce structural causal optimal transport and its associated ambiguity set, demonstrating their advantages and connections to previous methods. A key benefit of our approach is a relaxed version, where a regularization term replaces the complex causal constraints, enabling an efficient algorithm via difference-of-convex programming to solve structural causal optimal transport. We also show that when structural information is absent and must be estimated, our approach remains effective and provides finite sample guarantees. Lastly, we address the radius of ambiguity sets, illustrating how our method overcomes the curse of dimensionality in optimal transport problems, achieving faster shrinkage with dimension-free order.
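For orientation, the generic problem the abstract builds on can be written as follows. This is the standard DRO template, not notation taken from the paper: the loss $\ell$, decision variable $\theta$, empirical nominal distribution $\hat{P}_N$, discrepancy $D$, and radius $\delta$ are all assumed symbols.

$$
\min_{\theta} \; \sup_{Q \in \mathcal{B}_{\delta}(\hat{P}_N)} \mathbb{E}_{Z \sim Q}\big[\ell(\theta, Z)\big],
\qquad
\mathcal{B}_{\delta}(\hat{P}_N) = \big\{ Q : D(\hat{P}_N, Q) \le \delta \big\}.
$$

In this reading, the proposed approach would correspond to taking $D$ to be the structural causal optimal transport cost. For comparison, when $D$ is the classical Wasserstein-1 distance on $\mathbb{R}^d$ with $d \ge 3$, the expected distance between a distribution and its $N$-sample empirical version typically decays only like $N^{-1/d}$, which is the curse of dimensionality that the radius result targets.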