🤖 AI Summary
This paper addresses stochastic optimal control (SOC) problems under nonlinear, stochastic dynamics with state/control inequality constraints and structural constraints on control policies. We propose a parametric input inference method grounded in the expectation-maximization (EM) framework. A variational inference formulation models state-feedback policies, while barrier functions handle the state and control inequality constraints. Within each EM iteration, the expectation step smooths the state-control trajectory and the maximization step updates only the non-zero subset of the control parameters, which preserves the prescribed controller structure. To our knowledge, this is the first work to apply parametric input inference to structured stochastic controller learning, integrating probabilistic inference, variational optimization, and smoothing techniques. The approach is validated on three benchmark tasks: unicycle obstacle avoidance, four-unicycle formation control, and quadcopter navigation in a windy environment. Empirical results examine how barrier function parameters affect state constraint satisfaction and compare the performance of alternative smoothing algorithms within the proposed approach.
📝 Abstract
Approximate methods for solving stochastic optimal control (SOC) problems have received significant interest from researchers in the past decade. Probabilistic inference approaches to SOC have been developed to solve nonlinear quadratic Gaussian problems. In this work, we propose an Expectation-Maximization (EM) based inference procedure to generate state-feedback controls for constrained SOC problems. We consider inequality constraints on the states and controls as well as structural constraints on the controls. We employ barrier functions to address the state and control constraints. We show that the expectation step leads to smoothing of the state-control pair, while the maximization step on the non-zero subsets of the control parameters allows inference of structured stochastic optimal controllers. We demonstrate the effectiveness of the algorithm on three examples: unicycle obstacle avoidance, four-unicycle formation control, and quadcopter navigation in a windy environment. In these examples, we perform an empirical study of how the barrier function parameters affect state constraint satisfaction. We also present a comparative study of how different smoothing algorithms affect the performance of the proposed approach.
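The two ingredients named in the abstract, a barrier function for inequality constraints and a parameter update restricted to the non-zero entries of a structured controller, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the log-barrier form, the sharpness parameter `t`, the linear feedback law u = Kx, and the row-wise least-squares fit standing in for the maximization step are all hypothetical choices, as are the names `log_barrier` and `fit_structured_gain`.

```python
import numpy as np

def log_barrier(h, t=10.0):
    """Log-barrier penalty for constraints of the form h <= 0.

    Assumed form (not from the paper): -log(-h) / t when h < 0,
    and +inf when the constraint is violated or active.
    """
    h = np.atleast_1d(np.asarray(h, dtype=float))
    out = np.full(h.shape, np.inf)
    feasible = h < 0
    out[feasible] = -np.log(-h[feasible]) / t
    return out

def fit_structured_gain(X, U, mask):
    """Fit u = K x by least squares, updating only entries where mask is True.

    X: (T, n) smoothed states, U: (T, m) smoothed controls,
    mask: (m, n) boolean sparsity pattern of the gain K.
    Entries outside the mask stay exactly zero.
    """
    m, n = mask.shape
    K = np.zeros((m, n))
    for i in range(m):
        idx = np.flatnonzero(mask[i])  # free parameters of row i
        if idx.size == 0:
            continue
        sol, *_ = np.linalg.lstsq(X[:, idx], U[:, i], rcond=None)
        K[i, idx] = sol
    return K

# Hypothetical usage: two controls, three states, a fixed sparsity pattern.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
mask = np.array([[True, False, True],
                 [False, True, False]])
U = X @ np.array([[1.0, 0.0, -2.0],
                  [0.0, 0.5, 0.0]]).T
K = fit_structured_gain(X, U, mask)
penalty = log_barrier([-1.0, -0.1, 0.2])
```

In this sketch a larger `t` makes the penalty smaller inside the feasible region, which loosely mirrors the kind of barrier-parameter trade-off the abstract says is studied empirically. The masked fit shows why restricting the update to the non-zero subset keeps the controller structure intact: entries outside the mask are never touched.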