AI Summary
This paper addresses stochastic programming (SP) with covariates by proposing a piecewise affine decision rule (PADR) learning framework that directly maps features to optimal decisions. Methodologically, we formulate the PADR-ERM model based on empirical risk minimization and establish, for the first time, its non-asymptotic (unconstrained) and asymptotic (constrained) consistency guarantees. To ensure convergence for nonconvex PADRs, we introduce composite strong directional stationarity, a novel optimality condition tailored to nonsmooth, nonconvex SP. We further design an enhanced stochastic majorization-minimization algorithm for efficient optimization. Experiments demonstrate that our approach significantly reduces both decision cost and computational time across diverse nonconvex SP tasks. It exhibits strong robustness to high-dimensional features and nonlinear covariate dependencies, consistently outperforming state-of-the-art baselines in overall performance.
Abstract
Focusing on stochastic programming (SP) with covariate information, this paper proposes an empirical risk minimization (ERM) method embedded within a nonconvex piecewise affine decision rule (PADR), which aims to learn the direct mapping from features to optimal decisions. We establish the nonasymptotic consistency result of our PADR-based ERM model for unconstrained problems and asymptotic consistency result for constrained ones. To solve the nonconvex and nondifferentiable ERM problem, we develop an enhanced stochastic majorization-minimization algorithm and establish the asymptotic convergence to (composite strong) directional stationarity along with complexity analysis. We show that the proposed PADR-based ERM method applies to a broad class of nonconvex SP problems with theoretical consistency guarantees and computational tractability. Our numerical study demonstrates the superior performance of PADR-based ERM methods compared to state-of-the-art approaches under various settings, with significantly lower costs, less computation time, and robustness to feature dimensions and nonlinearity of the underlying dependency.
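To make the idea of a PADR-based ERM objective concrete, here is a minimal sketch in plain Python. It rests on assumptions that the abstract does not state: the decision rule is written as a difference of two max-affine functions (a standard way to represent any piecewise affine function), the cost is a toy newsvendor cost with hypothetical parameters `back` and `hold`, and all function names are illustrative. It only evaluates the empirical risk of a given rule; it does not implement the paper's enhanced stochastic majorization-minimization algorithm.

```python
from typing import List, Sequence, Tuple

# One affine piece: (weights, intercept). Hypothetical representation for this sketch.
Affine = Tuple[Sequence[float], float]


def max_affine(pieces: List[Affine], x: Sequence[float]) -> float:
    """Pointwise maximum of the affine functions w.x + b."""
    return max(sum(wi * xi for wi, xi in zip(w, x)) + b for w, b in pieces)


def padr(convex_part: List[Affine], concave_part: List[Affine],
         x: Sequence[float]) -> float:
    """Piecewise affine rule as a difference of two max-affine functions (assumed form)."""
    return max_affine(convex_part, x) - max_affine(concave_part, x)


def newsvendor_cost(order: float, demand: float,
                    back: float = 4.0, hold: float = 1.0) -> float:
    """Illustrative cost: backorder penalty on shortage, holding cost on surplus."""
    return back * max(demand - order, 0.0) + hold * max(order - demand, 0.0)


def empirical_risk(convex_part, concave_part, features, demands) -> float:
    """Average cost of the rule's decisions over the sample (the ERM objective)."""
    costs = [newsvendor_cost(padr(convex_part, concave_part, x), d)
             for x, d in zip(features, demands)]
    return sum(costs) / len(costs)


# Toy data: demand equals |x|, a nonlinear function of a scalar feature.
features = [[-2.0], [-1.0], [0.0], [1.0], [2.0]]
demands = [2.0, 1.0, 0.0, 1.0, 2.0]

# max(x, -x) - max(0) reproduces |x| exactly on this sample.
abs_rule = ([([1.0], 0.0), ([-1.0], 0.0)], [([0.0], 0.0)])
# A one-piece rule (constant order of 1.2) cannot follow the kink at x = 0.
constant_rule = ([([0.0], 1.2)], [([0.0], 0.0)])

risk_padr = empirical_risk(*abs_rule, features, demands)
risk_constant = empirical_risk(*constant_rule, features, demands)
print(risk_padr, risk_constant)
```

On this toy sample the two-piece rule matches the demand at every point, so its empirical risk is zero, while the one-piece rule incurs a positive average cost. Minimizing this objective over the piece parameters is the nonconvex, nondifferentiable problem the abstract refers to.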