Can Context Bridge the Reality Gap? Sim-to-Real Transfer of Context-Aware Policies

📅 2025-11-06
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address insufficient policy generalization in sim-to-real transfer due to dynamical discrepancies, this paper proposes a context-aware reinforcement learning framework. The method enables adaptive control in unknown real-world environments by online estimating physical dynamics parameters—such as friction, mass, and inertia—and feeding them as conditional inputs to the policy network. It integrates domain randomization, state inference, and conditional policy networks, incorporating a learnable dynamic context encoding module during training. Evaluated on standard control benchmarks (CartPole, Reacher) and a real robotic pushing task, the approach significantly outperforms context-agnostic baselines, achieving an average 32.7% improvement in task success rate under unseen dynamical configurations, while maintaining real-time inference capability. The core contribution lies in explicitly modeling implicit dynamics via a lightweight, online context estimation mechanism—and empirically demonstrating its critical role in enhancing cross-domain robustness.

Technology Category

Application Category

📝 Abstract
Sim-to-real transfer remains a major challenge in reinforcement learning (RL) for robotics, as policies trained in simulation often fail to generalize to the real world due to discrepancies in environment dynamics. Domain Randomization (DR) mitigates this issue by exposing the policy to a wide range of randomized dynamics during training, yet leading to a reduction in performance. While standard approaches typically train policies agnostic to these variations, we investigate whether sim-to-real transfer can be improved by conditioning the policy on an estimate of the dynamics parameters -- referred to as context. To this end, we integrate a context estimation module into a DR-based RL framework and systematically compare SOTA supervision strategies. We evaluate the resulting context-aware policies in both a canonical control benchmark and a real-world pushing task using a Franka Emika Panda robot. Results show that context-aware policies outperform the context-agnostic baseline across all settings, although the best supervision strategy depends on the task.
Problem

Research questions and friction points this paper is trying to address.

Addresses sim-to-real transfer challenges in robotics reinforcement learning
Improves policy generalization by incorporating dynamics context estimation
Compares context-aware versus context-agnostic policies across simulation and real tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates context estimation module into RL framework
Conditions policy on estimated dynamics parameters
Systematically compares state-of-the-art supervision strategies
🔎 Similar Papers
No similar papers found.
M
M. Iannotta
AASS Research Centre, Örebro University, Örebro, Sweden
Y
Yuxuan Yang
AASS Research Centre, Örebro University, Örebro, Sweden
J
J. A. Stork
AASS Research Centre, Örebro University, Örebro, Sweden
Erik Schaffernicht
Erik Schaffernicht
Leiter Technologietransferzentrum Kitzingen, Technische Hochschule Würzburg-Schweinfurt
roboticsmachine learningfeature selectionmobile robot olfaction
Todor Stoyanov
Todor Stoyanov
Associate Professor, Örebro University
Robotics3D PerceptionMobile ManipulationRobot Learning