🤖 AI Summary
This paper addresses two key challenges in contextual stochastic combinatorial optimization: insufficient exploitation of contextual information and the difficulty of minimizing empirical risk. To tackle these, we propose a framework integrating operations research with machine learning. Methodologically, we design a neural network architecture incorporating a combinatorial optimization layer, introduce a new regularization technique based on sparse perturbations on the distribution simplex, and extend Fenchel-Young loss theory to derive a generic primal-dual algorithm. Theoretically, we establish linear convergence of the algorithm under suitable conditions and bound the non-optimality of the resulting policy in terms of the empirical risk. Empirically, our method is efficient and scalable, and on contextual stochastic minimum weight spanning tree tasks it matches the performance of imitation learning on solutions computed by an expensive Lagrangian-based heuristic.
📝 Abstract
This paper introduces a novel approach to contextual stochastic optimization, integrating operations research and machine learning to address decision-making under uncertainty. Traditional methods often fail to leverage contextual information, which underscores the necessity for new algorithms. In this study, we utilize neural networks with combinatorial optimization layers to encode policies. Our goal is to minimize the empirical risk, which is estimated from past data on uncertain parameters and contexts. To that end, we present a surrogate learning problem and a generic primal-dual algorithm that is applicable to various combinatorial settings in stochastic optimization. Our approach extends classic Fenchel-Young loss results and introduces a new regularization method using sparse perturbations on the distribution simplex. This allows for tractable updates in the original space and can accommodate diverse objective functions. We demonstrate the linear convergence of our algorithm under certain conditions and provide a bound on the non-optimality of the resulting policy in terms of the empirical risk. Experiments on a contextual stochastic minimum weight spanning tree problem show that our algorithm is efficient and scalable, achieving performance comparable to imitation learning of solutions computed using an expensive Lagrangian-based heuristic.
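To make the combinatorial-layer idea concrete, here is a minimal sketch of a perturbation-based Fenchel-Young loss gradient for a spanning-tree layer: the gradient of the loss is the expected perturbed minimizer minus the target solution. This sketch uses standard Gaussian perturbations and a Kruskal oracle for illustration, not the paper's sparse perturbations on the distribution simplex or its primal-dual algorithm; all function names are ours.

```python
import numpy as np

def kruskal_mst(weights, edges, n_nodes):
    """Return a 0/1 indicator vector over `edges` selecting a
    minimum-weight spanning tree (Kruskal with union-find)."""
    parent = list(range(n_nodes))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    sol = np.zeros(len(edges))
    for i in np.argsort(weights):          # edges in increasing weight
        u, v = edges[i]
        ru, rv = find(u), find(v)
        if ru != rv:                       # keep edge unless it closes a cycle
            parent[ru] = rv
            sol[i] = 1.0
    return sol

def fy_loss_grad(theta, target, edges, n_nodes, eps=0.5, n_samples=100, rng=None):
    """Monte-Carlo estimate of the Fenchel-Young loss gradient:
    E[MST(theta + eps * Z)] - target, with Z standard Gaussian."""
    if rng is None:
        rng = np.random.default_rng(0)
    avg = np.zeros_like(theta, dtype=float)
    for _ in range(n_samples):
        z = rng.standard_normal(theta.shape)
        avg += kruskal_mst(theta + eps * z, edges, n_nodes)
    return avg / n_samples - target
```

In a learning loop, `theta` would be the edge weights predicted by the neural network from the context, and this gradient would be backpropagated through the predictor; the zero-gradient condition says the expected perturbed tree matches the target tree.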