🤖 AI Summary
Humanoid robots struggle to acquire diverse whole-body motor skills stably under a single policy. To address this, the paper proposes VMS, a unified whole-body control framework. The method combines three components: a hybrid tracking objective that balances local motion fidelity with global trajectory consistency; an Orthogonal Mixture-of-Experts (OMoE) architecture that encourages skill specialization while improving generalization across motions; and a segment-level tracking reward that relaxes rigid step-wise matching, improving robustness to global displacements and transient inaccuracies. Evaluated in simulation and on real humanoid hardware, the approach remains stable over minute-long sequences, imitates dynamic motions accurately, and generalizes to unseen motions. The results point to VMS as a scalable foundation for versatile humanoid whole-body control.
📝 Abstract
Learning versatile whole-body skills by tracking various human motions is a fundamental step toward general-purpose humanoid robots. This task is particularly challenging because a single policy must master a broad repertoire of motion skills while ensuring stability over long-horizon sequences. To this end, we present VMS, a unified whole-body controller that enables humanoid robots to learn diverse and dynamic behaviors within a single policy. Our framework integrates a hybrid tracking objective that balances local motion fidelity with global trajectory consistency, and an Orthogonal Mixture-of-Experts (OMoE) architecture that encourages skill specialization while enhancing generalization across motions. A segment-level tracking reward is further introduced to relax rigid step-wise matching, enhancing robustness when handling global displacements and transient inaccuracies. We validate VMS extensively in both simulation and real-world experiments, demonstrating accurate imitation of dynamic skills, stable performance over minute-long sequences, and strong generalization to unseen motions. These results highlight the potential of VMS as a scalable foundation for versatile humanoid whole-body control. The project page is available at https://kungfubot2-humanoid.github.io.
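The idea of relaxing step-wise matching can be illustrated with a small sketch. The abstract does not give the reward's formula, so everything below is an assumption made for illustration: the exponential kernel, the `sigma` scale, the `half_window` parameter, and the function names are hypothetical and are not taken from VMS. The sketch compares a strict per-step reward, which scores the robot state against the reference frame at the same time index, with a windowed variant that scores it against the best-matching reference frame in a short segment around that index.

```python
import math


def squared_error(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def stepwise_reward(state, reference, t, sigma=0.5):
    """Strict tracking: compare the state only with reference frame t."""
    return math.exp(-squared_error(state, reference[t]) / sigma ** 2)


def segment_reward(state, reference, t, half_window=2, sigma=0.5):
    """Relaxed tracking (hypothetical form): compare the state with every
    reference frame in [t - half_window, t + half_window] and keep the
    smallest error, so a small timing offset is not penalized."""
    lo = max(0, t - half_window)
    hi = min(len(reference), t + half_window + 1)
    best = min(squared_error(state, reference[k]) for k in range(lo, hi))
    return math.exp(-best / sigma ** 2)


# Toy reference motion: 11 frames of a 3-dimensional pose moving linearly.
reference = [[0.1 * k] * 3 for k in range(11)]

# The robot is one frame ahead of schedule: at t = 5 it matches frame 6.
state = reference[6]

print("step-wise reward:", stepwise_reward(state, reference, 5))
print("segment reward:  ", segment_reward(state, reference, 5))
```

In this toy case the strict reward drops below 1 because of the one-frame timing offset, while the windowed reward stays at 1 because frame 6 lies inside the segment. By construction the windowed reward is never lower than the strict one, which is one simple way such a relaxation could tolerate transient deviations.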