🤖 AI Summary
This work addresses the lack of standardized pipelines for sim-to-real transfer in humanoid robot reinforcement learning, a gap that often undermines deployment reliability. To this end, we propose AGILE, an end-to-end reinforcement learning workflow tailored for humanoids that integrates four stages: interactive environment verification, reproducible training, unified evaluation, and descriptor-driven deployment. The framework supports automated regression testing and includes training stabilizations intended to improve optimization stability and sim-to-real transfer. We demonstrate real-world deployment of five representative humanoid skills, spanning locomotion, recovery, motion imitation, and loco-manipulation, on the Unitree G1 and Booster T1 platforms, with consistent sim-to-real transfer across both.
📝 Abstract
Recent advances in reinforcement learning (RL) have enabled impressive humanoid behaviors in simulation, yet transferring these results to new robots remains challenging. In many real deployments, the primary bottleneck is no longer simulation throughput or algorithm design, but the absence of systematic infrastructure that links environment verification, training, evaluation, and deployment in a coherent loop.
To address this gap, we present AGILE, an end-to-end workflow for humanoid RL that standardizes the policy-development lifecycle to mitigate common sim-to-real failure modes. AGILE comprises four stages: (1) interactive environment verification, (2) reproducible training, (3) unified evaluation, and (4) descriptor-driven deployment via robot/task configuration descriptors. In the evaluation stage, AGILE supports both scenario-based tests and randomized rollouts under a shared suite of motion-quality diagnostics, enabling automated regression testing and principled robustness assessment. In the training stage, AGILE incorporates a set of training stabilizations and algorithmic enhancements to improve optimization stability and sim-to-real transfer.
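To make the idea of automated regression testing over a shared set of diagnostics concrete, the following is a minimal sketch and not AGILE's actual implementation. The class `MetricCheck`, the function `find_regressions`, and the metric names `success_rate` and `joint_jerk` are all hypothetical, chosen only for illustration; the abstract does not specify which diagnostics or thresholds AGILE uses.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MetricCheck:
    """One diagnostic with a stored baseline (hypothetical structure)."""
    name: str
    baseline: float          # value recorded from a previously accepted policy
    tolerance: float         # allowed drift before the check fails
    higher_is_better: bool   # direction in which the metric improves


def find_regressions(checks: List[MetricCheck],
                     measured: Dict[str, float]) -> List[str]:
    """Return the names of metrics that regressed beyond tolerance.

    A metric missing from `measured` is also reported, since a silently
    dropped diagnostic should not pass a regression gate.
    """
    failures: List[str] = []
    for check in checks:
        if check.name not in measured:
            failures.append(check.name)
            continue
        value = measured[check.name]
        if check.higher_is_better:
            regressed = value < check.baseline - check.tolerance
        else:
            regressed = value > check.baseline + check.tolerance
        if regressed:
            failures.append(check.name)
    return failures


# Illustrative values only; these are not results from the paper.
checks = [
    MetricCheck("success_rate", baseline=0.95, tolerance=0.02, higher_is_better=True),
    MetricCheck("joint_jerk", baseline=10.0, tolerance=1.0, higher_is_better=False),
]
measured = {"success_rate": 0.90, "joint_jerk": 10.5}
failed = find_regressions(checks, measured)
```

Under this sketch, the same list of checks could be applied to metrics gathered from scenario-based tests and from randomized rollouts, which is one way a shared diagnostic suite might gate a new policy before deployment.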
With this pipeline in place, we validate AGILE across five representative humanoid skills spanning locomotion, recovery, motion imitation, and loco-manipulation on two hardware platforms (Unitree G1 and Booster T1), achieving consistent sim-to-real transfer. Overall, AGILE shows that a standardized, end-to-end workflow can substantially improve the reliability and reproducibility of humanoid RL development.