Impact of Markov Decision Process Design on Sim-to-Real Reinforcement Learning

📅 2026-03-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Reinforcement learning policies often fail in simulation-to-reality (sim-to-real) transfer due to modeling inaccuracies. This work systematically investigates how key components of the Markov Decision Process (MDP)—including state representation, objectives, reward function, termination conditions, and dynamics model—affect transfer performance in industrial control tasks. It reveals, for the first time, the underlying mechanisms by which these design choices critically influence sim-to-real success and formulates actionable MDP design principles. Through physically accurate modeling and ablation studies on a color-mixing task, the proposed approach achieves up to 50% success rate in the real world, whereas simplified models completely fail, thereby demonstrating that rigorous MDP design is essential for effective sim-to-real transfer.

📝 Abstract
Reinforcement Learning (RL) has demonstrated strong potential for industrial process control, yet policies trained in simulation often suffer from a significant sim-to-real gap when deployed on physical hardware. This work systematically analyzes how core Markov Decision Process (MDP) design choices -- state composition, target inclusion, reward formulation, termination criteria, and environment dynamics models -- affect this transfer. Using a color mixing task, we evaluate different MDP configurations and mixing dynamics across simulation and real-world experiments. We validate our findings on physical hardware, demonstrating that physics-based dynamics models achieve up to 50% real-world success under strict precision constraints where simplified models fail entirely. Our results provide practical MDP design guidelines for deploying RL in industrial process control.
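The abstract's design dimensions (state composition, target inclusion, reward formulation, termination criteria, dynamics model) can be made concrete with a toy sketch. The class below is a hypothetical illustration, not the paper's actual environment: the linear "mixing" dynamics, the tolerance value, and all names are assumptions chosen only to show where each MDP design choice lives in code.

```python
class ColorMixingMDP:
    """Hypothetical color-mixing MDP sketch: the agent adds pigment
    per channel to steer a mixture toward a target color. The linear
    saturating dynamics are an illustrative stand-in, not the paper's
    physics-based model."""

    def __init__(self, target=(0.6, 0.3, 0.1), tol=0.05, max_steps=50):
        self.target = target        # target inclusion: target is part of the state
        self.tol = tol              # termination criterion: strict precision band
        self.max_steps = max_steps  # termination criterion: step budget
        self.reset()

    def reset(self):
        self.color = [0.0, 0.0, 0.0]  # current RGB mixture
        self.steps = 0
        return self._state()

    def _state(self):
        # State composition: current color concatenated with the target
        return tuple(self.color) + self.target

    def step(self, action):
        # Environment dynamics: simplified additive mixing with saturation
        self.color = [min(1.0, c + a) for c, a in zip(self.color, action)]
        self.steps += 1
        err = max(abs(c - t) for c, t in zip(self.color, self.target))
        done = err <= self.tol or self.steps >= self.max_steps
        reward = -err               # reward formulation: dense negative color error
        return self._state(), reward, done
```

Ablating any one of the commented choices (e.g., dropping the target from the state, or swapping the dense reward for a sparse success bonus) yields the kind of MDP variants the paper compares across simulation and hardware.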
Problem

Research questions and friction points this paper is trying to address.

sim-to-real gap
reinforcement learning
Markov Decision Process
industrial process control
policy transfer
Innovation

Methods, ideas, or system contributions that make the work stand out.

Markov Decision Process
sim-to-real transfer
reinforcement learning
physics-based dynamics
industrial process control
Tatjana Krau
Institute of Production and Informatics, 87527 Sonthofen, Germany
Jorge Mandlmaier
Institute of Production and Informatics, 87527 Sonthofen, Germany
Tobias Damm
Professor for Systems and Control Theory, RPTU Kaiserslautern-Landau, Germany
systems and control, applied linear algebra
Frieder Heieck
Institute of Production and Informatics, 87527 Sonthofen, Germany