🤖 AI Summary
This work addresses the challenge of making robust decisions when no historical data from the target distribution is available, by leveraging related but distributionally distinct data sources. The authors propose an out-of-distribution stochastic optimization framework that models all data distributions as draws from a meta-distribution and learns a data-driven uncertainty set in a reproducing kernel Hilbert space (RKHS), yielding a min-max stochastic program with adjustable conservatism. The approach comes with rigorous out-of-distribution generalization guarantees for both the uncertainty set and the resulting solution, and it remains effective when only a small or moderate number of related data sources are available. Empirical evaluations on multi-item newsvendor and portfolio optimization tasks demonstrate superior out-of-distribution performance under unseen distributions.
📝 Abstract
Data-driven decision-making under uncertainty typically presumes that historical data have been collected from the unknown target probability distribution. However, one may have no access to any data from the target distribution prior to decision-making. To address this challenge, we propose robust out-of-distribution stochastic optimization, a novel data-driven framework that uses relevant data distributions to make robust decisions under unseen distributions. A key feature of our framework is that all data distributions are assumed to be randomly generated from a meta-distribution over distributions. To describe the uncertainty in distribution generation, we learn a data-driven uncertainty set in a reproducing kernel Hilbert space (RKHS) from the relevant data distributions, with adjustable conservatism. We then incorporate this set into a min-max stochastic program to derive robust decisions. Notably, under the randomness of distribution generation, we establish rigorous out-of-distribution generalization guarantees for both the uncertainty set and the solution. To make the RKHS problem tractable, we present an approximate parametrization with provably bounded suboptimality, together with a row generation strategy. Extensive numerical experiments on multi-item newsvendor and portfolio optimization demonstrate the superior out-of-distribution performance of our decision-making framework on unseen data distributions, even when only a small or moderate number of relevant sources are available.
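To make the min-max structure concrete, the following is a minimal sketch and not the authors' method. It assumes a single-item newsvendor, replaces the learned RKHS uncertainty set with the finite set of empirical source distributions, and solves the outer minimization by grid search. The cost parameters, the three Gaussian demand sources, and all function names are hypothetical choices made only for illustration.

```python
import random


def newsvendor_cost(order, demand, overage=1.0, underage=3.0):
    """Cost of ordering `order` units when `demand` materializes (hypothetical unit costs)."""
    return overage * max(order - demand, 0.0) + underage * max(demand - order, 0.0)


def expected_cost(order, sample):
    """Sample-average cost under one empirical source distribution."""
    return sum(newsvendor_cost(order, d) for d in sample) / len(sample)


def robust_order(sources, candidates):
    """Return the candidate order minimizing the worst-case expected cost over all sources.

    Simplifying assumption: the uncertainty set is just the finite list of
    empirical source distributions, not a set learned in an RKHS.
    """
    return min(candidates, key=lambda q: max(expected_cost(q, s) for s in sources))


rng = random.Random(0)
# Three hypothetical source distributions that differ in mean demand.
sources = [[rng.gauss(mu, 5.0) for _ in range(200)] for mu in (40.0, 50.0, 60.0)]
candidates = [float(q) for q in range(20, 91)]

q_robust = robust_order(sources, candidates)
worst_case = max(expected_cost(q_robust, s) for s in sources)
print(q_robust, worst_case)
```

The inner `max` plays the role of the adversary choosing a distribution and the outer `min` picks the decision. In the framework described above, the adversary would instead range over a learned uncertainty set whose size controls conservatism, which this finite-set simplification does not capture.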