Impact of Markov Decision Process Design on Sim-to-Real Reinforcement Learning

📅 2026-03-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Reinforcement learning policies often fail in simulation-to-reality (sim-to-real) transfer due to modeling inaccuracies. This work systematically investigates how key components of the Markov Decision Process (MDP)—including state representation, objectives, reward function, termination conditions, and dynamics model—affect transfer performance in industrial control tasks. It reveals, for the first time, the underlying mechanisms by which these design choices critically influence sim-to-real success and formulates actionable MDP design principles. Through physically accurate modeling and ablation studies on a color-mixing task, the proposed approach achieves up to 50% success rate in the real world, whereas simplified models completely fail, thereby demonstrating that rigorous MDP design is essential for effective sim-to-real transfer.

📝 Abstract
Reinforcement Learning (RL) has demonstrated strong potential for industrial process control, yet policies trained in simulation often suffer from a significant sim-to-real gap when deployed on physical hardware. This work systematically analyzes how core Markov Decision Process (MDP) design choices -- state composition, target inclusion, reward formulation, termination criteria, and environment dynamics models -- affect this transfer. Using a color mixing task, we evaluate different MDP configurations and mixing dynamics across simulation and real-world experiments. We validate our findings on physical hardware, demonstrating that physics-based dynamics models achieve up to 50% real-world success under strict precision constraints where simplified models fail entirely. Our results provide practical MDP design guidelines for deploying RL in industrial process control.
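The abstract's design dimensions (state composition, target inclusion, reward formulation, termination criteria, dynamics model) can be made concrete with a toy sketch. The class below is a hypothetical illustration, not the paper's actual environment: the linear "mixing" dynamics, the tolerance value, and all names are assumptions chosen only to show where each MDP design choice lives in code.

```python
class ColorMixingMDP:
    """Hypothetical color-mixing MDP sketch: the agent adds pigment
    per channel to steer a mixture toward a target color. The linear
    saturating dynamics are an illustrative stand-in, not the paper's
    physics-based model."""

    def __init__(self, target=(0.6, 0.3, 0.1), tol=0.05, max_steps=50):
        self.target = target        # target inclusion: target is part of the state
        self.tol = tol              # termination criterion: strict precision band
        self.max_steps = max_steps  # termination criterion: step budget
        self.reset()

    def reset(self):
        self.color = [0.0, 0.0, 0.0]  # current RGB mixture
        self.steps = 0
        return self._state()

    def _state(self):
        # State composition: current color concatenated with the target
        return tuple(self.color) + self.target

    def step(self, action):
        # Environment dynamics: simplified additive mixing with saturation
        self.color = [min(1.0, c + a) for c, a in zip(self.color, action)]
        self.steps += 1
        err = max(abs(c - t) for c, t in zip(self.color, self.target))
        done = err <= self.tol or self.steps >= self.max_steps
        reward = -err               # reward formulation: dense negative color error
        return self._state(), reward, done
```

Ablating any one of the commented choices (e.g., dropping the target from the state, or swapping the dense reward for a sparse success bonus) yields the kind of MDP variants the paper compares across simulation and hardware.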
Problem

Research questions and friction points this paper is trying to address.

sim-to-real gap
reinforcement learning
Markov Decision Process
industrial process control
policy transfer
Innovation

Methods, ideas, or system contributions that make the work stand out.

Markov Decision Process
sim-to-real transfer
reinforcement learning
physics-based dynamics
industrial process control
Tatjana Krau
Institute of Production and Informatics, 87527 Sonthofen, Germany
Jorge Mandlmaier
Institute of Production and Informatics, 87527 Sonthofen, Germany
Tobias Damm
Professor for Systems and Control Theory, RPTU Kaiserslautern-Landau, Germany
systems and control, applied linear algebra
Frieder Heieck
Institute of Production and Informatics, 87527 Sonthofen, Germany