🤖 AI Summary
To address low sample efficiency, slow exploration, and heavy reliance on human intervention in real-world robotic reinforcement learning (RL), this paper proposes SimLauncher: a framework that pretrains a visuomotor policy in a digital-twin simulation environment and uses it to bootstrap target values and supply action proposals, enabling efficient sim-to-real transfer. The method combines real-world RL with guidance from offline demonstrations, enhanced exploration, and joint use of demonstrations from both simulated and real-world sources. Its key innovation lies in repositioning the simulator as an *active guidance engine* rather than a passive pretraining platform, thereby drastically reducing the need for real-world interaction. Evaluated on multi-stage, contact-rich, and dexterous manipulation tasks, SimLauncher achieves near-100% task success rates and significantly outperforms existing baselines in sample efficiency.
📝 Abstract
Autonomous learning of dexterous, long-horizon robotic skills has been a longstanding pursuit of embodied AI. Recent advances in robotic reinforcement learning (RL) have demonstrated remarkable performance and robustness in real-world visuomotor control tasks. However, applying RL in the real world faces challenges such as low sample efficiency, slow exploration, and significant reliance on human intervention. In contrast, simulators offer a safe and efficient environment for extensive exploration and data collection, while the visual sim-to-real gap, often a limiting factor, can be mitigated using real-to-sim techniques. Building on these observations, we propose SimLauncher, a novel framework that combines the strengths of real-world RL and real-to-sim-to-real approaches to overcome these challenges. Specifically, we first pre-train a visuomotor policy in the digital twin simulation environment, which then benefits real-world RL in two ways: (1) bootstrapping target values using extensive simulated demonstrations and real-world demonstrations derived from pre-trained policy rollouts, and (2) incorporating action proposals from the pre-trained policy for better exploration. We conduct comprehensive experiments across multi-stage, contact-rich, and dexterous hand manipulation tasks. Compared to prior real-world RL approaches, SimLauncher significantly improves sample efficiency and achieves near-perfect success rates. We hope this work serves as a proof of concept and inspires further research on leveraging large-scale simulation pre-training to benefit real-world robotic RL.
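The two mechanisms described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation; the function names (`select_action`, `td_target`), the mixing probability `proposal_prob`, and the one-step TD target are illustrative assumptions about how a pretrained policy's proposals might be blended into exploration and how demonstration transitions might bootstrap a critic's target values:

```python
import random

def select_action(rl_policy, sim_policy, obs, proposal_prob=0.3, rng=random):
    """Exploration with action proposals (assumed mixing scheme):
    with probability `proposal_prob`, act on a proposal from the
    simulation-pretrained policy; otherwise use the RL agent's own action."""
    if rng.random() < proposal_prob:
        return sim_policy(obs)  # proposal from the pretrained policy
    return rl_policy(obs)       # the agent's own (exploratory) action

def td_target(reward, next_value, done, gamma=0.99):
    """Standard one-step TD target. Transitions from simulated
    demonstrations and pretrained-policy rollouts supply
    (reward, next_value, done) tuples that bootstrap the critic
    before much real-world data has been collected."""
    return reward + gamma * next_value * (1.0 - float(done))
```

In this sketch, annealing `proposal_prob` toward zero as the real-world policy improves would let the agent gradually take over from the pretrained policy.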