Discrete-Time Distribution Steering using Monte Carlo Tree Search

📅 2024-12-09

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the problem of steering the state distribution of a nonlinear dynamical system toward a target distribution. It formulates distribution steering as a discrete-time Markov decision process (MDP) with explicit state-distribution constraints, in which state-feedback policies form the action space. It introduces a novel, differentiable, and computationally efficient distance metric between distributions and extends Monte Carlo tree search (MCTS), reportedly for the first time, to distribution steering under arbitrary (including strongly nonlinear) dynamics, which removes the reliance on linearization. Experiments on linear and nonlinear systems are reported to match target distributions more closely and more robustly than baselines based on linearization or moment matching.

📝 Abstract

Optimal control problems with state distribution constraints have attracted interest for their expressivity, but existing solutions rely on linear approximations. We approach the problem of driving the state of a dynamical system in distribution from a sequential decision-making perspective. We formulate the optimal control problem as an appropriate Markov decision process (MDP), where the actions correspond to state-feedback control policies. We then solve the MDP using Monte Carlo tree search (MCTS), which makes our method suitable for any dynamics model. A key component of our approach is a novel, easy-to-compute distance metric in the distribution space that allows our algorithm to guide the distribution of the state. We experimentally test our algorithm under both linear and nonlinear dynamics.
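
To make the formulation in the abstract concrete, here is a minimal sketch of how a distribution-steering MDP could be searched with MCTS. It is an illustration under stated assumptions, not the authors' implementation. The state distribution is approximated by a particle cloud, each action is a scalar feedback gain from a hypothetical finite set `GAINS`, the dynamics in `step` are an invented scalar example, and `cost` is a simple mean-and-standard-deviation mismatch that stands in for the paper's distance metric, which the abstract does not specify.

```python
import math
import random

GAINS = [0.1, 0.3, 0.5, 0.7]  # hypothetical finite action set of feedback gains


def step(particles, gain, rng):
    # Assumed scalar nonlinear dynamics with feedback u = -gain * x and additive noise.
    return [x + 0.1 * math.sin(x) - gain * x + rng.gauss(0.0, 0.05) for x in particles]


def cost(particles, target_mean, target_std):
    # Placeholder mismatch between the cloud and the target (not the paper's metric).
    n = len(particles)
    mean = sum(particles) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in particles) / n)
    return abs(mean - target_mean) + abs(std - target_std)


class Node:
    def __init__(self, particles, depth):
        self.particles = particles  # particle approximation of the state distribution
        self.depth = depth
        self.children = {}          # gain -> child Node
        self.visits = 0
        self.value = 0.0            # running mean of returns (negative final cost)


def ucb(parent, child, c=1.0):
    # Standard UCB1 score used to pick among fully expanded children.
    return child.value + c * math.sqrt(math.log(parent.visits) / child.visits)


def mcts(particles, gains, horizon, target, iters=200, seed=0):
    rng = random.Random(seed)
    root = Node(particles, 0)
    for _ in range(iters):
        node, path = root, [root]
        # Selection, with expansion of one untried action when available.
        while node.depth < horizon:
            untried = [g for g in gains if g not in node.children]
            if untried:
                g = rng.choice(untried)
                child = Node(step(node.particles, g, rng), node.depth + 1)
                node.children[g] = child
                node = child
                path.append(node)
                break
            node = max(node.children.values(), key=lambda ch: ucb(node, ch))
            path.append(node)
        # Rollout with uniformly random gains until the horizon.
        sim = node.particles
        for _ in range(horizon - node.depth):
            sim = step(sim, rng.choice(gains), rng)
        ret = -cost(sim, *target)
        # Backpropagate the return along the visited path.
        for visited in path:
            visited.visits += 1
            visited.value += (ret - visited.value) / visited.visits
    # Recommend the most visited first action.
    return max(root.children, key=lambda g: root.children[g].visits)


init_rng = random.Random(1)
start = [init_rng.gauss(1.0, 0.5) for _ in range(100)]
best_gain = mcts(start, GAINS, horizon=5, target=(0.0, 0.1))
```

Because each tree node stores its own particle cloud, the search never linearizes `step`; swapping in different dynamics only requires changing that one function.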

Problem

Research questions and friction points this paper is trying to address.

Computing similarity between probability distributions in control

Introducing interpretable distance based on cumulative distribution functions

Applying gradient-based solutions to distribution steering and ergodic control

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses cumulative distribution functions of linear projections

Computes interpretable and differentiable distribution distances

Applies gradient-based solutions to control problems
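
The idea of comparing distributions through the cumulative distribution functions of linear projections can be sketched as follows. This is an assumption-laden illustration, not the paper's definition: it projects two equal-size sample sets onto random unit directions and averages the area between the resulting one-dimensional empirical CDFs. The function names, the number of directions `n_dirs`, and the choice of random directions are all hypothetical.

```python
import math
import random


def random_direction(dim, rng):
    # Draw a direction uniformly on the unit sphere by normalizing a Gaussian vector.
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def project(points, direction):
    # Sorted one-dimensional projections of the points onto the direction.
    return sorted(sum(p * d for p, d in zip(point, direction)) for point in points)


def cdf_gap_1d(a_sorted, b_sorted):
    # For equal-size samples, the area between the two empirical CDFs equals
    # the mean absolute difference of the sorted samples.
    return sum(abs(x - y) for x, y in zip(a_sorted, b_sorted)) / len(a_sorted)


def projected_cdf_distance(a, b, n_dirs=32, seed=0):
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal-size sample sets")
    rng = random.Random(seed)
    dim = len(a[0])
    total = 0.0
    for _ in range(n_dirs):
        d = random_direction(dim, rng)
        total += cdf_gap_1d(project(a, d), project(b, d))
    return total / n_dirs


data_rng = random.Random(2)
cloud = [[data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)] for _ in range(50)]
shifted = [[x + 1.0, y] for x, y in cloud]
gap = projected_cdf_distance(cloud, shifted)
```

Each projection reduces the comparison to one dimension, where sorting the samples is enough to evaluate the CDF gap, which is what keeps a distance of this kind cheap to compute and easy to interpret.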

🔎 Similar Papers

Real-time Motion Planning for autonomous vehicles in dynamic environments