🤖 AI Summary
This work addresses nonlinear stochastic optimization problems involving conditional expectations in the setting where direct sampling from the conditional distribution $Y \mid X$ is infeasible. Leveraging an i.i.d. stream of observational data $\{(X^k, Y^k)\}$, the authors propose a joint learning-and-optimization algorithm that simultaneously approximates the conditional expectation within a parameterized function class and optimizes the outer objective. By integrating techniques from function approximation, stochastic approximation, and nonlinear functional optimization, all under the minimal assumption of access only to observed data pairs, the authors establish, for the first time, a convergence rate of $\mathcal{O}(1/\sqrt{N})$, where $N$ is the number of observed pairs. This provides a theoretically grounded and practically viable approach to real-world problems in which conditional sampling is unavailable.
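Using the notation of the abstract below, one plausible way to write the problem the paper addresses is the following, where $g$ is an assumed symbol for the nonlinear outer loss functional (not taken from the paper):

$$
\min_{\beta} \;\; \mathbb{E}_{X}\Big[\, g\big(X,\; \mathbb{E}\big[f(X, Y, \beta) \mid X\big]\big) \Big],
$$

to be solved using only joint draws $(X^k, Y^k)$, since $Y$ cannot be sampled conditionally on a chosen value of $X$.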
📝 Abstract
We consider a stochastic optimization problem involving two random variables: a context variable $X$ and a dependent variable $Y$. The objective is to minimize the expected value of a nonlinear loss functional applied to the conditional expectation $\mathbb{E}[f(X, Y, \beta) \mid X]$, where $f$ is a nonlinear function and $\beta$ represents the decision variables. We focus on the practically important setting in which direct sampling from the conditional distribution of $Y \mid X$ is infeasible, and only a stream of i.i.d. observation pairs $\{(X^k, Y^k)\}_{k=0,1,2,\ldots}$ is available. In our approach, the conditional expectation is approximated within a prespecified parametric function class. We analyze a simultaneous learning-and-optimization algorithm that jointly estimates the conditional expectation and optimizes the outer objective, and establish that the method achieves a convergence rate of order $\mathcal{O}\big(1/\sqrt{N}\big)$, where $N$ denotes the number of observed pairs.
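To make the simultaneous learning-and-optimization idea concrete, below is a minimal sketch on a toy instance. All specific choices are assumptions made for illustration and are not the paper's algorithm: $f(x, y, \beta) = y - \beta x$, outer loss $g(z) = z^2$, a linear parametric class $m_\theta(x) = \theta x$ for approximating $\mathbb{E}[Y \mid X = x]$, and the hypothetical helper `sample_pair`.

```python
# Minimal illustrative sketch (not the authors' algorithm) of a simultaneous
# learning-and-optimization loop on a toy instance of this problem class.
# Toy assumptions: f(x, y, beta) = y - beta * x, outer loss g(z) = z^2,
# and a linear parametric class m_theta(x) = theta * x for E[Y | X = x].
import numpy as np

rng = np.random.default_rng(0)

def sample_pair():
    """Draw one observational pair (X^k, Y^k); Y depends on X, but we only
    see joint draws and cannot sample Y conditionally on a chosen x."""
    x = rng.normal()
    y = 2.0 * x + rng.normal(scale=0.5)   # E[Y | X = x] = 2x, unknown to the algorithm
    return x, y

theta = 0.0   # parameter of the conditional-expectation approximation m_theta
beta = 0.0    # outer decision variable
N = 20_000

for k in range(1, N + 1):
    x, y = sample_pair()
    alpha = 0.1 / np.sqrt(k)              # diminishing step size
    # Learning step: stochastic gradient of (theta * x - y)^2 w.r.t. theta,
    # nudging m_theta(x) = theta * x toward the conditional expectation of Y.
    theta -= alpha * 2.0 * (theta * x - y) * x
    # Optimization step: plug the current estimate of E[f(X, Y, beta) | X = x],
    # namely m_theta(x) - beta * x, into the outer loss g(z) = z^2 and take a
    # stochastic gradient step in beta.
    cond_est = theta * x - beta * x
    beta -= alpha * 2.0 * cond_est * (-x)

print(f"theta ~ {theta:.3f} (target 2.0), beta ~ {beta:.3f} (minimizer 2.0)")
```

Each iteration uses a single observed pair both to refine the parametric estimate of the conditional expectation and to take a stochastic gradient step on the outer objective evaluated at that estimate, mirroring the simultaneous structure described in the abstract.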