Discrete-Time Distribution Steering using Monte Carlo Tree Search

Date: 2024-12-09
Venue: arXiv.org
Citations: 0
Influential: 0
AI Summary
This work addresses the problem of precisely steering state distributions in nonlinear dynamical systems. Methodologically, it formulates distributional control as a discrete-time Markov decision process (MDP) with explicit state-distribution constraints, where state-feedback policies serve as the action space. A novel, differentiable, and computationally efficient distribution distance metric is introduced, and, crucially, Monte Carlo tree search (MCTS) is extended for the first time to distribution-guided control under arbitrary (including strongly nonlinear) dynamics, eliminating reliance on linearization approximations. Experiments across diverse linear and nonlinear systems demonstrate that the proposed framework achieves significantly higher fidelity in matching target distributions, exhibits robust performance, and consistently outperforms baseline methods based on linearization or moment-matching techniques.

Abstract
Optimal control problems with state distribution constraints have attracted interest for their expressivity, but solutions rely on linear approximations. We approach the problem of driving the state of a dynamical system in distribution from a sequential decision-making perspective. We formulate the optimal control problem as an appropriate Markov decision process (MDP), where the actions correspond to the state-feedback control policies. We then solve the MDP using Monte Carlo tree search (MCTS). This renders our method suitable for any dynamics model. A key component of our approach is a novel, easy to compute, distance metric in the distribution space that allows our algorithm to guide the distribution of the state. We experimentally test our algorithm under both linear and nonlinear dynamics.
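To make the MDP view concrete, here is a minimal sketch in Python, not the authors' implementation. It assumes a scalar state, represents the state distribution by a set of particles, and treats each action as an affine feedback policy u = k*x + v. The dynamics, the action set `ACTIONS`, the horizon, and the 1-D Wasserstein-1 terminal cost `w1` are all hypothetical stand-ins chosen for illustration; the paper's own distance metric is not reproduced here. The search is a plain UCT-style MCTS, and each child's particles are sampled once at expansion, so the tree treats transitions as deterministic.

```python
import math
import random

# Hypothetical finite action set: each action is an affine state-feedback
# policy u = k * x + v, identified by the pair (k, v).
ACTIONS = [(k, v) for k in (-0.5, 0.0, 0.3) for v in (-1.0, 0.0, 1.0)]
HORIZON = 3  # assumed number of control steps


def step(particles, action, rng):
    """Push every particle through assumed nonlinear dynamics under one policy."""
    k, v = action
    return [x + 0.5 * math.sin(x) + (k * x + v) + rng.gauss(0.0, 0.05)
            for x in particles]


def w1(a, b):
    """1-D Wasserstein-1 distance between equal-size samples (stand-in cost)."""
    return sum(abs(p - q) for p, q in zip(sorted(a), sorted(b))) / len(a)


class Node:
    def __init__(self, particles, depth):
        self.particles, self.depth = particles, depth
        self.children = {}  # action index -> Node
        self.visits, self.value = 0, 0.0


def ucb(parent, child, c):
    """UCB1 score: mean reward plus an exploration bonus."""
    return (child.value / child.visits
            + c * math.sqrt(math.log(parent.visits) / child.visits))


def simulate(node, target, rng, c=1.0):
    """One MCTS iteration from `node`; returns the reward that was backed up."""
    if node.depth == HORIZON:
        reward = -w1(node.particles, target)
    else:
        untried = [i for i in range(len(ACTIONS)) if i not in node.children]
        if untried:
            # Expansion, then a random rollout to the horizon.
            i = rng.choice(untried)
            child = Node(step(node.particles, ACTIONS[i], rng), node.depth + 1)
            node.children[i] = child
            p = child.particles
            for _ in range(child.depth, HORIZON):
                p = step(p, rng.choice(ACTIONS), rng)
            reward = -w1(p, target)
            child.visits += 1
            child.value += reward
        else:
            # Selection: descend into the child with the best UCB1 score.
            i = max(node.children, key=lambda j: ucb(node, node.children[j], c))
            reward = simulate(node.children[i], target, rng, c)
    node.visits += 1
    node.value += reward
    return reward


def plan(initial, target, iterations=400, seed=0):
    """Run MCTS and return the most-visited first policy and the root node."""
    rng = random.Random(seed)
    root = Node(initial, 0)
    for _ in range(iterations):
        simulate(root, target, rng)
    best = max(root.children, key=lambda j: root.children[j].visits)
    return ACTIONS[best], root


sample_rng = random.Random(1)
initial = [sample_rng.gauss(0.0, 1.0) for _ in range(200)]
target = [sample_rng.gauss(3.0, 0.5) for _ in range(200)]
first_policy, root = plan(initial, target)
print("first feedback policy (k, v):", first_policy)
```

Because `step` only needs to be evaluated on samples, swapping in different dynamics changes nothing else in the search, which is the sense in which a tree-search formulation does not depend on linearization.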
Problem

Research questions and friction points this paper is trying to address.

Computing similarity between probability distributions in control
Introducing interpretable distance based on cumulative distribution functions
Applying gradient-based solutions to distribution steering and ergodic control
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses cumulative distribution functions of linear projections
Computes interpretable and differentiable distribution distances
Applies gradient-based solutions to control problems
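The idea of comparing cumulative distribution functions of linear projections can be sketched as follows. This is an illustrative construction under stated assumptions, not the paper's metric: it projects two sample sets onto random unit directions and averages the area between the empirical CDFs of the projections. The function names, the number of directions `n_dirs`, and the grid resolution `n_grid` are hypothetical. Note that this grid-based empirical version is not set up for gradients; a differentiable variant would need a different treatment of the CDFs.

```python
import bisect
import math
import random


def ecdf(sorted_vals, t):
    """Empirical CDF of a sorted sample, evaluated at t."""
    return bisect.bisect_right(sorted_vals, t) / len(sorted_vals)


def projected_cdf_distance(xs, ys, n_dirs=32, n_grid=64, seed=0):
    """Average area between CDFs of random 1-D projections (illustrative only)."""
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_dirs):
        # Draw a random unit direction.
        d = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in d))
        d = [c / norm for c in d]
        # Project both sample sets onto the direction and sort.
        px = sorted(sum(a * b for a, b in zip(x, d)) for x in xs)
        py = sorted(sum(a * b for a, b in zip(y, d)) for y in ys)
        lo, hi = min(px[0], py[0]), max(px[-1], py[-1])
        h = (hi - lo) / n_grid
        # Midpoint-rule approximation of the integral of |F_x(t) - F_y(t)|.
        total += h * sum(
            abs(ecdf(px, lo + (i + 0.5) * h) - ecdf(py, lo + (i + 0.5) * h))
            for i in range(n_grid)
        )
    return total / n_dirs


rng = random.Random(0)
cloud_a = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(300)]
cloud_b = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(300)]
cloud_c = [(rng.gauss(2, 1), rng.gauss(0, 1)) for _ in range(300)]
near = projected_cdf_distance(cloud_a, cloud_b)  # same underlying distribution
far = projected_cdf_distance(cloud_a, cloud_c)   # mean shifted along one axis
print(near, far)
```

Each term is a one-dimensional quantity with a direct reading (how far apart two CDF curves are along one direction), which is one way to interpret the "interpretable" claim in the list above.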