🤖 AI Summary
This paper addresses the optimal feedback control problem for stochastic control-affine systems with unknown nonlinear dynamics and stage-cost functions, assuming only the control penalty term and the constraints are known. We propose a fully data-driven framework that applies kernel mean embeddings (KMEs) to nonparametrically identify the Markov transition operators of controlled diffusion processes, and combines them with convex operator-theoretic reformulations of the Hamilton–Jacobi–Bellman (HJB) equation, thereby mitigating the curse of dimensionality inherent in classical dynamic programming. The approach operates entirely within reproducing kernel Hilbert spaces, relying on kernel methods and convex optimization rather than parametric model assumptions or trained function approximators. Evaluated on several high-dimensional nonlinear stochastic systems, the method demonstrates strong data efficiency and scalability, pointing toward real-time optimal control of black-box systems.
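The operator-theoretic HJB recursion mentioned above can be sketched in its generic finite-horizon dynamic-programming form (a standard template, not necessarily the paper's exact formulation; the symbols below are illustrative):

```latex
V_k(x) \;=\; \min_{u \in \mathcal{U}} \Big\{ \, c(x,u) \;+\; r(u) \;+\; \mathbb{E}\big[\, V_{k+1}(X_{k+1}) \,\big|\, X_k = x,\; U_k = u \,\big] \Big\},
```

where $c$ is the unknown stage cost (identified from data), $r$ is the known control penalty, $\mathcal{U}$ is the constraint set, and the conditional expectation is the action of the Markov transition operator on $V_{k+1}$, which is exactly the quantity the KME estimates nonparametrically.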
📝 Abstract
This paper proposes a fully data-driven approach for optimal control of nonlinear control-affine systems represented by a stochastic diffusion. The focus is on the scenario where both the nonlinear dynamics and the stage cost function are unknown, while only the control penalty function and the constraints are provided. Leveraging the theory of reproducing kernel Hilbert spaces, we introduce novel kernel mean embeddings (KMEs) to identify the Markov transition operators associated with controlled diffusion processes. The KME learning approach integrates seamlessly with modern convex operator-theoretic Hamilton–Jacobi–Bellman recursions. Thus, unlike traditional dynamic programming methods, our approach exploits the "kernel trick" to break the curse of dimensionality. We demonstrate the effectiveness of our method through numerical examples, highlighting its ability to solve a large class of nonlinear optimal control problems.
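To make the KME identification step concrete, here is a minimal sketch of the standard conditional mean embedding estimator, which the paper's approach builds on: given state-action samples paired with next states, it produces weights such that the conditional expectation of any function of the next state is approximated by a weighted sum over training samples. All function names and hyperparameters below are illustrative, not taken from the paper.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between row-sample arrays A (n,d) and B (m,d)."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * sq)

def kme_transition_weights(Z_train, z_query, lam=1e-3, gamma=1.0):
    """Conditional mean embedding weights: E[g(X') | z] ~= alpha @ g(X'_train).

    Z_train: (n, d) stacked state-action samples observed during data collection.
    z_query: (m, d) state-action pairs at which to evaluate the transition operator.
    Returns alpha with shape (m, n), the standard kernel-ridge estimator
    alpha = k(z, Z) (K + n*lam*I)^{-1}.
    """
    n = Z_train.shape[0]
    K = rbf_kernel(Z_train, Z_train, gamma)           # Gram matrix on training inputs
    k = rbf_kernel(z_query, Z_train, gamma)           # cross-kernel to query points
    return k @ np.linalg.solve(K + n * lam * np.eye(n), np.eye(n))
```

In a dynamic-programming recursion, the expected cost-to-go at `(x, u)` would then be approximated as `alpha @ V(X'_train)`, turning the unknown transition operator into a finite linear map estimated purely from trajectory data.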