AI Summary
To address the limited robustness of locomotion policies under sim-to-real transfer for humanoid robots, this paper proposes a reinforcement learning framework based on learnable adversarial attacks. The method employs an end-to-end trainable adversarial perturbation network that dynamically identifies and attacks vulnerable policy states, driving the policy to proactively adapt to worst-case disturbances in simulation and improving its robustness to modeling errors and environmental uncertainties. The key innovation is embedding adversarial training into the whole-body motion control loop, integrating perception-guided trajectory tracking with sim-to-real co-optimization. Evaluation on the Unitree G1 platform demonstrates that the proposed approach significantly narrows the sim-to-real performance gap: it substantially improves stability and robustness in real-world deployment, particularly for complex terrain traversal and highly agile whole-body trajectory tracking tasks.
Abstract
Humanoid robots show significant potential in daily tasks. However, reinforcement-learning-based motion policies often suffer from robustness degradation due to the sim-to-real dynamics gap, which limits the agility of real robots. In this work, we propose a novel robust adversarial training paradigm designed to enhance the robustness of humanoid motion policies in the real world. The paradigm introduces a learnable adversarial attack network that precisely identifies vulnerabilities in motion policies and applies targeted perturbations, forcing the motion policy to improve its robustness against perturbations through dynamic adversarial training. We conduct experiments on the Unitree G1 humanoid robot for both perceptive locomotion and whole-body control tasks. The results demonstrate that our proposed method significantly enhances the robot's motion robustness in real-world environments, enabling successful traversal of challenging terrains and highly agile whole-body trajectory tracking.
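The core idea above — a learned adversary that perturbs the policy's dynamics during training, so the policy must optimize against worst-case disturbances — can be illustrated with a deliberately tiny minimax sketch. Everything here is an illustrative assumption, not the paper's method: the 1-D stabilization task, the linear policy and adversary, the perturbation budget `EPS`, and the random hill-climbing optimizer all stand in for the paper's neural networks and RL training on the humanoid.

```python
# Toy sketch of adversarial policy training (all details are illustrative
# assumptions, not the paper's actual networks or training algorithm).
import numpy as np

rng = np.random.default_rng(0)
EPS = 0.2   # assumed adversarial perturbation budget
T = 30      # episode length

def rollout(k, w, x0=1.0):
    """One episode: linear policy a = -k*x, bounded adversary d = EPS*tanh(w*x)."""
    x, cost = x0, 0.0
    for _ in range(T):
        a = -k * x                   # policy action
        d = EPS * np.tanh(w * x)     # adversarial disturbance, |d| <= EPS
        x = x + a + d                # perturbed dynamics
        cost += x**2 + 0.1 * a**2    # tracking error + control effort
    return cost

def best_adversary(k, iters=200):
    """Hill-climb the adversary weight w to maximize the policy's cost."""
    w, c = 0.0, rollout(k, 0.0)
    for _ in range(iters):
        w2 = w + 0.3 * rng.standard_normal()
        c2 = rollout(k, w2)
        if c2 > c:
            w, c = w2, c2
    return w, c

# Alternating minimax: re-fit the strongest adversary, then improve the
# policy against it (the adversarial training loop, in miniature).
k = 0.0
for _ in range(50):
    w, _ = best_adversary(k, iters=50)
    k2 = k + 0.1 * rng.standard_normal()
    if rollout(k2, w) < rollout(k, w):
        k = k2

_, worst_trained = best_adversary(k)    # worst-case cost after training
_, worst_passive = best_adversary(0.0)  # worst-case cost of the untrained policy
```

Under this toy setup, the adversarially trained policy ends with a far lower worst-case cost than the untrained one, mirroring the paper's claim that training against a learned attacker yields robustness to worst-case disturbances rather than to average-case noise.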