PolySim: Bridging the Sim-to-Real Gap for Humanoid Control via Multi-Simulator Dynamics Randomization

📅 2025-10-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Whole-body control (WBC) policies for humanoid robots that are trained in a single simulator inherit that simulator's inductive biases, which degrades performance during sim-to-real transfer. PolySim is a training platform that launches parallel environments from multiple heterogeneous simulators (e.g., MuJoCo, IsaacSim) within a single training run, exposing the policy to dynamics-level domain randomization across engines. The authors show theoretically that this yields a tighter upper bound on simulator inductive bias than single-simulator training. In sim-to-sim evaluations, PolySim substantially reduces motion-tracking error; on MuJoCo, it improves execution success by 52.8 over an IsaacSim baseline. The trained policy is deployed zero-shot on a real Unitree G1 without additional fine-tuning.

📝 Abstract

Humanoid whole-body control (WBC) policies trained in simulation often suffer from the sim-to-real gap, which fundamentally arises from simulator inductive bias, the inherent assumptions and limitations of any single simulator. These biases lead to nontrivial discrepancies both across simulators and between simulation and the real world. To mitigate the effect of simulator inductive bias, the key idea is to train policies jointly across multiple simulators, encouraging the learned controller to capture dynamics that generalize beyond any single simulator's assumptions. We thus introduce PolySim, a WBC training platform that integrates multiple heterogeneous simulators. PolySim can launch parallel environments from different engines simultaneously within a single training run, thereby realizing dynamics-level domain randomization. Theoretically, we show that PolySim yields a tighter upper bound on simulator inductive bias than single-simulator training. In experiments, PolySim substantially reduces motion-tracking error in sim-to-sim evaluations; for example, on MuJoCo, it improves execution success by 52.8 over an IsaacSim baseline. PolySim further enables zero-shot deployment on a real Unitree G1 without additional fine-tuning, showing effective transfer from simulation to the real world. We will release the PolySim code upon acceptance of this work.
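
The idea of mixing engines inside one batch of parallel environments can be illustrated with a small sketch. This is not the PolySim implementation, which the abstract does not describe in detail; `ToyBackend`, `assign_backends`, and `rollout_step` are hypothetical names, and the one-dimensional point-mass dynamics are a stand-in for real physics engines such as MuJoCo or IsaacSim. The sketch only shows how environment slots might be assigned to backends so that identical actions produce engine-dependent transitions within a single training step.

```python
from dataclasses import dataclass


@dataclass
class ToyBackend:
    """Hypothetical stand-in for a physics engine (not a real simulator API)."""
    name: str
    damping: float  # assumed engine-specific velocity decay per step
    dt: float       # assumed engine-specific integration step in seconds

    def step(self, pos, vel, action):
        # Semi-implicit Euler update of a 1-D point mass with damping.
        vel = (1.0 - self.damping) * vel + action * self.dt
        pos = pos + vel * self.dt
        return pos, vel


def assign_backends(num_envs, backends):
    """Assign environment slots to backends in round-robin order."""
    return [backends[i % len(backends)] for i in range(num_envs)]


def rollout_step(assignment, states, actions):
    """Advance every environment one step using its own backend's dynamics."""
    return [
        backend.step(pos, vel, action)
        for backend, (pos, vel), action in zip(assignment, states, actions)
    ]


backends = [
    ToyBackend("engine_a", damping=0.02, dt=0.005),
    ToyBackend("engine_b", damping=0.05, dt=0.004),
]
assignment = assign_backends(8, backends)
states = [(0.0, 0.0)] * 8   # (position, velocity) for each environment
actions = [1.0] * 8         # the same action is sent to every environment
next_states = rollout_step(assignment, states, actions)
```

Because the two toy backends integrate the same action differently, one batch contains transitions from both sets of dynamics, which is the sense in which the randomization happens at the level of dynamics rather than individual physical parameters. A real setup would also have to align observation and action spaces across engines, which this sketch leaves out.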

Problem

Research questions and friction points this paper is trying to address.

Bridging sim-to-real gap for humanoid control

Mitigating simulator inductive bias effects

Enabling zero-shot real-world policy transfer

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-simulator training reduces sim-to-real gap

Parallel environments enable dynamics-level domain randomization

Zero-shot deployment achieved without real-world fine-tuning

🔎 Similar Papers

Learning Multi-Modal Whole-Body Control for Real-World Humanoid Robots