Synthesis of Model Predictive Control and Reinforcement Learning: Survey and Classification

📅 2025-02-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the challenge of deeply integrating model predictive control (MPC) and reinforcement learning (RL), stemming from their fundamentally divergent model usage paradigms. To resolve this, we propose the first unified taxonomy for MPC–RL fusion, centered on *how models are used*, categorizing approaches into three paradigms: MPC-augmented RL, RL-augmented MPC, and co-designed architectures. Leveraging a unified Actor–Critic modeling framework, we systematically analyze how MPC’s online optimization enhances RL’s closed-loop performance and establish a performance-gain-oriented evaluation perspective grounded in closed-loop metrics. The survey comprehensively covers six application domains—including robotics, energy systems, and autonomous driving—and synthesizes cross-cutting modeling techniques bridging control theory and RL. Our work provides a scalable methodology and principled design guidelines for hybrid intelligent control systems.

📝 Abstract
Model predictive control (MPC) and reinforcement learning (RL) are two successful control techniques for Markov decision processes. Both approaches are derived from similar fundamental principles, and both are widely used in practical applications, including robotics, process control, energy systems, and autonomous driving. Despite their similarities, MPC and RL follow distinct paradigms that emerged from diverse communities and different requirements. Various technical discrepancies, particularly the role of an environment model as part of the algorithm, lead to methodologies with nearly complementary advantages. Due to these orthogonal benefits, research interest in combination methods has recently increased significantly, producing a large and growing set of complex ideas that leverage both MPC and RL. This work illuminates the differences, similarities, and fundamentals that allow for different combination algorithms, and categorizes existing work accordingly. In particular, we focus on the versatile actor-critic RL approach as a basis for our categorization and examine how the online optimization approach of MPC can be used to improve the overall closed-loop performance of a policy.
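The actor-critic view described in the abstract can be illustrated with a hedged toy sketch (not from the paper itself): a finite-horizon MPC acts as the policy ("actor"), and its terminal cost matrix plays the role a learned value function ("critic") would take in an RL-augmented scheme. The system, horizon, and cost weights below are illustrative assumptions; for this linear-quadratic case the MPC reduces to a backward Riccati recursion.

```python
import numpy as np

# Illustrative toy setup (not the paper's example): a discrete-time
# double integrator controlled by finite-horizon MPC used as the actor.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.eye(2)                # stage state cost
R = np.array([[0.1]])        # stage input cost
P_terminal = 10 * np.eye(2)  # terminal cost; in MPC-RL hybrids this is
                             # where a learned value function could enter
N = 20                       # prediction horizon

def mpc_policy(x):
    """Return the first input of the finite-horizon optimal sequence,
    computed exactly for this LQ problem by backward Riccati recursion."""
    P = P_terminal
    K = None
    for _ in range(N):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ A - A.T @ P @ B @ K
    return -K @ x  # receding horizon: apply only the first input

# Closed-loop rollout: the MPC is re-solved at every step,
# which is the "online optimization" the survey focuses on.
x = np.array([1.0, 0.0])
for _ in range(50):
    x = A @ x + (B @ mpc_policy(x)).ravel()

print(np.linalg.norm(x))  # state is driven toward the origin
```

Replacing the fixed `P_terminal` with a critic trained from closed-loop data is one simple instance of the RL-augmented-MPC paradigm the summary mentions; the survey classifies many richer variants.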
Problem

Research questions and friction points this paper is trying to address.

How to combine MPC and RL techniques
Analyze differences between the two control methodologies
Enhance closed-loop policy performance with MPC
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified taxonomy of MPC–RL combinations
Actor-critic RL as the unifying framework
Analysis of closed-loop performance gains
Authors

Rudolf Reiter, Postdoctoral Researcher at University of Zurich (Numerical Optimization, Model Predictive Control, Reinforcement Learning, Robotics, Machine Learning)
Jasper Hoffmann, Department of Computer Science, University of Freiburg, 79110 Freiburg, Germany
Dirk Reinhardt, Postdoctoral Fellow (Optimal Control)
Florian Messerer, University of Freiburg (Numerical Optimization, Optimal Control, Model Predictive Control)
Katrin Baumgartner, Department of Microsystems Engineering (IMTEK), University of Freiburg, 79110 Freiburg, Germany
Shamburaj Sawant, Department of Engineering Cybernetics, Norwegian University of Science and Technology (NTNU), 7034 Trondheim, Norway
Joschka Boedecker, Professor of Computer Science, University of Freiburg, Germany (Artificial Intelligence, Machine Learning, Reinforcement Learning, Robotics)
Moritz Diehl, Department of Microsystems Engineering (IMTEK), University of Freiburg, 79110 Freiburg, Germany
Sebastien Gros, Professor, Eng. Cybernetics, NTNU (Optimal Control, NMPC, Reinforcement Learning)