Diffusion Model Predictive Control

📅 2024-10-07
🏛️ arXiv.org
📈 Citations: 9
Influential: 2
🤖 AI Summary
To address the challenge that model predictive control (MPC) in offline reinforcement learning struggles to adapt to novel reward functions and non-stationary dynamics, this paper proposes Diffusion Model Predictive Control (D-MPC). D-MPC is the first method to unify diffusion models for both multi-step action generation and multi-step dynamics modeling, jointly optimizing both components within online MPC. It enables zero-shot reward reconfiguration and dynamic adaptation, eliminating reliance on pre-specified rewards or fixed dynamics models. Technically, it integrates diffusion-based modeling, uncertainty-aware sequential generation, and model-based planning. On the D4RL benchmark, D-MPC significantly outperforms state-of-the-art model-based offline planning methods such as MBOP, while matching the performance of current top-tier model-based and model-free RL algorithms.

📝 Abstract
We propose Diffusion Model Predictive Control (D-MPC), a novel MPC approach that learns a multi-step action proposal and a multi-step dynamics model, both using diffusion models, and combines them for use in online MPC. On the popular D4RL benchmark, we show performance that is significantly better than existing model-based offline planning methods using MPC (e.g. MBOP) and competitive with state-of-the-art (SOTA) model-based and model-free reinforcement learning methods. We additionally illustrate D-MPC's ability to optimize novel reward functions at run time and adapt to novel dynamics, and highlight its advantages compared to existing diffusion-based planning baselines.
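The abstract describes the core loop: sample candidate action sequences from a diffusion action-proposal model, roll them forward with a diffusion dynamics model, score the predicted trajectories under the current reward, and execute the best first action in receding-horizon fashion. Below is a minimal, hedged sketch of that sample-and-rank MPC loop. All names (`sample_action_sequence`, `sample_trajectory`, `d_mpc_step`) are illustrative, not the authors' API, and the two diffusion models are stubbed with toy random/linear stand-ins so the loop runs.

```python
import numpy as np

# Hypothetical sketch of a D-MPC-style planning step. Two learned diffusion
# models are assumed:
#   sample_action_sequence(state)        -> proposes an H-step action sequence
#   sample_trajectory(state, actions)    -> predicts the H-step state trajectory
# Both are stubbed here with toy stand-ins; in the paper these are diffusion
# models trained on offline data.

rng = np.random.default_rng(0)
HORIZON, NUM_SAMPLES, ACT_DIM, OBS_DIM = 8, 64, 2, 4

def sample_action_sequence(state):
    # Stand-in for the diffusion action proposal: random candidate actions.
    return rng.normal(size=(HORIZON, ACT_DIM))

def sample_trajectory(state, actions):
    # Stand-in for the multi-step diffusion dynamics model: toy linear dynamics.
    states = [state]
    for a in actions:
        states.append(states[-1] + 0.1 * a.sum())
    return np.array(states[1:])

def d_mpc_step(state, reward_fn):
    """Sample NUM_SAMPLES candidate plans, score each predicted trajectory
    under reward_fn, and return the first action of the best plan."""
    best_ret, best_plan = -np.inf, None
    for _ in range(NUM_SAMPLES):
        actions = sample_action_sequence(state)
        traj = sample_trajectory(state, actions)
        ret = sum(reward_fn(s, a) for s, a in zip(traj, actions))
        if ret > best_ret:
            best_ret, best_plan = ret, actions
    # Receding horizon: execute only the first action, then replan.
    return best_plan[0]

action = d_mpc_step(np.zeros(OBS_DIM), lambda s, a: -np.sum(s**2))
```

Because the reward function is supplied only at planning time, swapping in a different `reward_fn` reconfigures behavior without retraining either model, which is the mechanism behind the run-time reward optimization the abstract highlights.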
Problem

Research questions and friction points this paper is trying to address.

MPC-based offline planning struggles to adapt to novel reward functions specified at run time
Reliance on fixed, pre-trained dynamics models prevents adaptation to non-stationary dynamics
Existing MPC-based offline planners (e.g. MBOP) lag behind state-of-the-art model-based and model-free RL
Innovation

Methods, ideas, or system contributions that make the work stand out.

Learns multi-step action proposals and multi-step dynamics models, both with diffusion models
Combines the two diffusion models within an online MPC loop
Optimizes novel reward functions at run time and adapts to novel dynamics
Guangyao Zhou
Senior Research Scientist, Google DeepMind
Sivaramakrishnan Swaminathan
Google DeepMind
Rajkumar Vasudeva Raju
Google DeepMind
J. S. Guntupalli
Google DeepMind
Wolfgang Lehrach
Google DeepMind
Joseph Ortiz
Research Scientist, Google DeepMind
Machine Learning · Computer Vision · Robotics
A. Dedieu
Google DeepMind
Miguel Lázaro-Gredilla
Google DeepMind
Kevin Murphy
Google DeepMind