🤖 AI Summary
This paper addresses contextual optimization under covariate shift, where the test-time covariate distribution deviates from the training distribution, degrading upstream distribution estimates and, in turn, the downstream decisions built on them. To tackle this, we propose a robust ambiguity set defined as the intersection of two Wasserstein balls, one centered at a nonparametric (kernel-based) estimate and the other at a parametric model estimate, jointly balancing flexibility and structured prior knowledge. We reformulate the resulting distributionally robust optimization problem into a computationally tractable form, establish statistical concentration guarantees for the estimators under covariate shift, and introduce a surrogate objective that reduces computational cost while retaining similar generalization guarantees. Case studies on income prediction and portfolio optimization demonstrate that our method outperforms baselines, achieving a favorable trade-off among robustness, generalization, and computational efficiency.
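As a sketch of the general form described above (the notation and radii here are our own, not taken from the paper), the robust problem can be written as:

```latex
% Illustrative sketch of a DRO problem over the intersection of two Wasserstein
% balls; symbols, radii, and cost are assumed notation, not the paper's.
\min_{z \in \mathcal{Z}} \;
  \sup_{\mathbb{Q} \,\in\, \mathcal{B}_{\varepsilon_1}(\hat{\mathbb{P}}_{\mathrm{NP}})
        \,\cap\, \mathcal{B}_{\varepsilon_2}(\hat{\mathbb{P}}_{\mathrm{P}})}
  \mathbb{E}_{\mathbb{Q}}\big[\, c(z,\xi) \,\big],
\qquad
\mathcal{B}_{\varepsilon}(\hat{\mathbb{P}})
  = \big\{ \mathbb{Q} : W(\mathbb{Q}, \hat{\mathbb{P}}) \le \varepsilon \big\},
```

where $\hat{\mathbb{P}}_{\mathrm{NP}}$ and $\hat{\mathbb{P}}_{\mathrm{P}}$ denote the nonparametric and parametric estimates given the new covariate, $W$ is a Wasserstein distance, and $c(z,\xi)$ is the operational cost of decision $z$ under uncertainty $\xi$.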
📝 Abstract
In contextual optimization, a decision-maker observes historical samples of uncertain variables together with concurrently observed covariates, without knowing their joint distribution. Given a new covariate observation, the goal is to choose a decision that minimizes an operational cost. A prevalent issue in this setting is covariate shift, where the marginal distribution of the new covariate differs from that of the historical samples, causing the decision performance of nonparametric and parametric estimators to vary. To address this, we propose a distributionally robust approach whose ambiguity set is formed by the intersection of two Wasserstein balls, one centered at a typical nonparametric distribution estimator and the other at a parametric one. Computationally, we establish a tractable reformulation of the resulting distributionally robust optimization problem. Statistically, we provide guarantees for our Wasserstein-ball-intersection approach under covariate shift by analyzing the measure concentration of the estimators. Furthermore, to reduce computational complexity, we employ a surrogate objective that maintains similar generalization guarantees. Through synthetic and empirical case studies on income prediction and portfolio optimization, we demonstrate the strong empirical performance of our proposed models.
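To make the setup concrete, here is a minimal Python sketch under an assumed toy setting (the data-generating process, bandwidth, Wasserstein radii, and newsvendor-style cost are all illustrative choices, not from the paper). It brute-forces the worst case over a small candidate family inside the intersection of the two balls, rather than using the paper's tractable reformulation.

```python
# Minimal, self-contained sketch (assumed toy setup, not the paper's reformulation):
# approximate the worst case over the intersection of two Wasserstein balls by
# brute force over a small family of candidate distributions.
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)

# Historical data: covariate x and uncertain variable xi (e.g., demand).
x_hist = rng.uniform(0.0, 1.0, size=200)
xi_hist = 2.0 * x_hist + rng.normal(0.0, 0.3, size=200)
x_new = 0.9  # new covariate, possibly drawn from a shifted distribution

# Center 1 (nonparametric): kernel-weighted empirical distribution of xi given x_new.
h = 0.1  # bandwidth, chosen arbitrarily for illustration
w = np.exp(-0.5 * ((x_hist - x_new) / h) ** 2)
w /= w.sum()

# Center 2 (parametric): linear-Gaussian conditional model fit by least squares.
beta = np.polyfit(x_hist, xi_hist, deg=1)
resid = xi_hist - np.polyval(beta, x_hist)
xi_param = rng.normal(np.polyval(beta, x_new), resid.std(), size=500)

eps_np, eps_p = 0.25, 0.25  # Wasserstein radii (assumed)

def in_intersection(sample, sample_w):
    """Is the candidate empirical distribution inside both Wasserstein balls?"""
    d_np = wasserstein_distance(sample, xi_hist, u_weights=sample_w, v_weights=w)
    d_p = wasserstein_distance(sample, xi_param, u_weights=sample_w)
    return d_np <= eps_np and d_p <= eps_p

# Candidate family: shifted copies of the kernel-weighted sample (a crude proxy
# for the full ambiguity set); keep only those lying in the intersection.
candidates = [(xi_hist + s, w) for s in np.linspace(-0.2, 0.2, 9)
              if in_intersection(xi_hist + s, w)]
if not candidates:  # fall back to the nonparametric center itself
    candidates = [(xi_hist, w)]

# Newsvendor-style operational cost with assumed overage/underage penalties.
def expected_cost(z, sample, sample_w):
    return np.average(4.0 * np.maximum(sample - z, 0.0)
                      + 1.0 * np.maximum(z - sample, 0.0), weights=sample_w)

# Robust decision: minimize the worst-case expected cost over admissible candidates.
zs = np.linspace(0.0, 4.0, 200)
worst = [max(expected_cost(z, c, cw) for c, cw in candidates) for z in zs]
print(f"robust order quantity: {zs[int(np.argmin(worst))]:.2f}")
```

In one dimension, membership in each ball can be checked directly with `scipy.stats.wasserstein_distance`; the paper's tractable reformulation avoids this kind of candidate enumeration altogether.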