PISTO: Proximal Inference for Stochastic Trajectory Optimization

📅 2026-05-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses unstable updates in stochastic trajectory optimization by reinterpreting the STOMP algorithm from a variational inference perspective. It introduces a proximal inference framework that adds a KL divergence regularizer between successive Gaussian proposal distributions to the objective, yielding a closed-form mean update with a trust-region interpretation. The expectations in that update are estimated with importance-weighted Monte Carlo sampling, so the method needs no gradients and retains STOMP's ability to handle non-differentiable and discontinuous cost functions. On robot arm motion planning benchmarks, the method achieves an 89% success rate, compared with 63% for CHOMP and 68% for STOMP, while producing shorter, smoother trajectories at twice the speed of competing stochastic methods. On contact-rich MuJoCo locomotion and manipulation tasks, it also achieves higher reward than the CEM and MPPI baselines.

📝 Abstract

Stochastic trajectory optimization methods like STOMP enable planning with non-differentiable costs, offering substantial flexibility over gradient-based approaches. We show that STOMP implicitly minimizes the KL divergence from a Boltzmann trajectory distribution, revealing an elegant Variational Inference (VI) structure underlying its updates. Building on this insight, we propose the Proximal Inference for Stochastic Trajectory Optimization (PISTO) algorithm that stabilizes the updates by augmenting the objective with a KL regularization between successive Gaussian proposals. This proximal formulation admits a trust-region interpretation and yields closed-form mean updates computable as expectations under a surrogate distribution. We estimate these expectations via importance-weighted Monte Carlo sampling, producing a simple, derivative-free algorithm that inherits STOMP's ability to handle non-differentiable and discontinuous costs without modification. On robot arm motion planning benchmarks, PISTO achieves an 89% success rate, outperforming CHOMP (63%) and STOMP (68%), while producing shorter, smoother paths at twice the speed of competing stochastic methods. We further validate PISTO on contact-rich MuJoCo locomotion and manipulation tasks, where it consistently outperforms both CEM and MPPI baselines in reward.
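
The following is a minimal sketch of the kind of update the abstract describes, not the paper's actual algorithm. It makes several simplifying assumptions that do not come from the source: the proposal covariance is held fixed, the importance weights are plain Boltzmann weights exp(-cost / temperature), and the KL term between successive proposals (which, for Gaussians with equal covariance, reduces to a quadratic penalty on the mean shift) is applied with a hypothetical strength 1/eta. Under those assumptions the regularized mean has the closed form mean + eta / (1 + eta) * (weighted_mean - mean). The names `proximal_mean_update`, `step_cost`, `temperature`, and `eta` are illustrative only.

```python
import numpy as np


def proximal_mean_update(mean, cov, cost_fn, n_samples=256,
                         temperature=1.0, eta=1.0, rng=None):
    """One damped, importance-weighted mean update for a Gaussian proposal.

    Illustrative assumption: fixed covariance and Boltzmann weights; this is
    a simplified stand-in, not the update derived in the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Draw candidate trajectories (here: plain vectors) from the proposal.
    samples = rng.multivariate_normal(mean, cov, size=n_samples)
    # Only cost evaluations are needed, so cost_fn may be discontinuous.
    costs = np.array([cost_fn(x) for x in samples])
    # Self-normalized weights; subtracting the minimum avoids underflow.
    weights = np.exp(-(costs - costs.min()) / temperature)
    weights /= weights.sum()
    weighted_mean = weights @ samples
    # Quadratic (equal-covariance KL) penalty gives a step in (0, 1):
    # small eta keeps the new mean close to the old one.
    step = eta / (1.0 + eta)
    return mean + step * (weighted_mean - mean)


def step_cost(x):
    """Hypothetical discontinuous cost: distance to a goal plus a wall penalty."""
    goal = np.array([2.0, 2.0])
    in_wall = abs(x[0] - 1.0) < 0.2 and x[1] < 1.0
    return float(np.sum((x - goal) ** 2) + (10.0 if in_wall else 0.0))


rng = np.random.default_rng(0)
initial_mean = np.zeros(2)
cov = 0.25 * np.eye(2)

final_mean = initial_mean.copy()
for _ in range(30):
    final_mean = proximal_mean_update(final_mean, cov, step_cost, rng=rng)

print("initial cost:", step_cost(initial_mean))
print("final cost:", step_cost(final_mean))
```

In this toy setting, lowering `eta` shortens each step and trades convergence speed for smoother, more conservative updates, which is one way to read the trust-region interpretation mentioned in the abstract.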

Problem

Research questions and friction points this paper is trying to address.

Stochastic Trajectory Optimization

Non-differentiable Costs

Update Stability

Motion Planning

Variational Inference

Innovation

Methods, ideas, or system contributions that make the work stand out.

Proximal Inference

Stochastic Trajectory Optimization

Variational Inference