🤖 AI Summary
This work addresses data-driven discrete-time stochastic control in overparameterised settings, with the aim of guaranteeing stability for policies learned from finitely many observations. The proposed method exploits the intrinsic structure of problems in which significant portions of the dynamics are uncontrolled, and reformulates the original problem as a sequence of infinite-dimensional minimisation problems by combining the dynamic programming principle with the mean-field interpretation of single-hidden-layer neural networks. The key contribution is a set of practically verifiable assumptions under which carefully regularised minimisers satisfy non-asymptotic generalisation error bounds, which in turn relate solutions obtained by noisy stochastic gradient descent to the optimal control policies. Numerical experiments on classic stochastic control problems indicate improved generalisation and closed-loop stability for policies learned from limited data.
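To make the training loop concrete, here is a minimal sketch, not the authors' implementation: a single dynamic-programming stage of a scalar stochastic control problem, with the control given by a wide single-hidden-layer network in its mean-field scaling and the outer weights trained by noisy stochastic gradient descent with an explicit regulariser. The dynamics, costs, value-function surrogate, and hyper-parameters are all illustrative assumptions.

```python
# Minimal sketch (not the paper's code): one dynamic-programming stage of a
# scalar stochastic control problem. The control is a wide single-hidden-layer
# network in its mean-field scaling, and the outer weights are trained by noisy
# SGD with an L2 regulariser. All dynamics, costs and hyper-parameters below
# are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Assumed dynamics x' = a*x + b*u + sigma*xi, stage cost x^2 + r*u^2, and a
# quadratic surrogate q*x'^2 standing in for the next-stage value function.
a, b, sigma, r, q = 0.9, 0.5, 0.1, 0.1, 1.0

width = 1024                      # overparameterised hidden layer
lam = 1e-3                        # regularisation strength
lr, temp, steps, batch = 0.1, 1e-4, 3000, 256

# Mean-field parameterisation: u(x) = (1/width) * sum_i c_i * tanh(w_i*x + b_i).
w = rng.normal(size=width)
bias = rng.normal(size=width)
c = np.zeros(width)

def features(x):
    """Hidden activations; x has shape (batch,), output (batch, width)."""
    return np.tanh(np.outer(x, w) + bias)

for _ in range(steps):
    x = rng.normal(size=batch)                 # sampled current states
    xi = rng.normal(size=batch)                # sampled process noise
    h = features(x)
    u = h @ c / width                          # network control
    x_next = a * x + b * u + sigma * xi

    # Per-particle gradient of E[x^2 + r*u^2 + q*x_next^2] + lam*mean(c^2):
    # the chain-rule factor `width` cancels the 1/width in the parameterisation.
    du = 2.0 * r * u + 2.0 * q * b * x_next    # dJ/du for each sample
    grad = (h * du[:, None]).mean(axis=0) + 2.0 * lam * c

    # Noisy SGD step: gradient descent plus small Langevin-style parameter noise.
    c -= lr * grad
    c += np.sqrt(2.0 * lr * temp) * rng.normal(size=width)

# Sanity check against the noise-free optimum of this single stage,
# u*(x) = -(a*b*q / (r + b^2*q)) * x.
x_test = np.linspace(-1.5, 1.5, 5)
print("learned:", np.round(features(x_test) @ c / width, 3))
print("optimal:", np.round(-(a * b * q / (r + b * b * q)) * x_test, 3))
```

Training only the outer weights keeps the gradient in closed form; in the mean-field picture this corresponds to moving each particle's output weight while its feature stays fixed, which is enough to illustrate the noisy-SGD dynamics and the role of the regulariser.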
📝 Abstract
We consider a data-driven formulation of the classical discrete-time stochastic control problem. Our approach exploits the natural structure of many such problems, in which significant portions of the system are uncontrolled. Employing the dynamic programming principle and the mean-field interpretation of single-hidden-layer neural networks, we formulate the control problem as a series of infinite-dimensional minimisation problems. When these problems are regularised carefully, we provide practically verifiable assumptions under which their minimisers satisfy non-asymptotic bounds on the generalisation error, thus ensuring stability in overparametrised settings for controls learned from finitely many observations. We explore connections to the traditional noisy stochastic gradient descent algorithm, and subsequently show promising numerical results for some classic control problems.
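As a schematic of the kind of formulation the abstract describes (notation assumed here for illustration, not taken from the paper), each backward step of the dynamic programme becomes a minimisation over a probability measure on the parameters of a single hidden unit:

```latex
% Schematic only; f, c, g, \Phi, m_t, \mathcal{R} and \lambda are assumed notation.
\begin{align*}
  V_T(x) &= g(x),\\
  V_t(x) &= \inf_{m_t \in \mathcal{P}(\mathbb{R}^p)}\;
     \mathbb{E}\!\left[\, c\bigl(x, u_{m_t}(x)\bigr)
       + V_{t+1}\bigl(f(x, u_{m_t}(x), \xi_t)\bigr) \right]
       + \lambda\, \mathcal{R}(m_t),\\
  u_{m_t}(x) &= \int_{\mathbb{R}^p} \Phi(x;\theta)\, m_t(\mathrm{d}\theta).
\end{align*}
```

Here $\Phi(x;\theta)$ is a single hidden unit (for instance $\theta_1\tanh(\theta_2^\top x+\theta_3)$), so $u_{m_t}$ is the mean-field limit of a single-hidden-layer network, $\xi_t$ is the process noise, and $\mathcal{R}$ is the regulariser (for example relative entropy or a moment penalty) whose careful choice underlies the non-asymptotic generalisation bounds and the connection to noisy stochastic gradient descent.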