KungfuBot2: Learning Versatile Motion Skills for Humanoid Whole-Body Control

📅 2025-09-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Humanoid robots struggle to acquire diverse whole-body motor skills stably under a single policy. This paper proposes VMS, a unified whole-body control framework with three components: a hybrid tracking objective that balances local motion fidelity with global trajectory consistency; an Orthogonal Mixture-of-Experts (OMoE) architecture that encourages skill specialization while improving generalization across motions; and a segment-level tracking reward that relaxes rigid step-wise matching, improving robustness to global displacements and transient inaccuracies. Evaluated in simulation and on real humanoid hardware, the approach imitates dynamic skills accurately, remains stable over minute-long sequences, and generalizes to unseen motions. The authors present VMS as a potential scalable foundation for versatile humanoid whole-body control.

📝 Abstract
Learning versatile whole-body skills by tracking various human motions is a fundamental step toward general-purpose humanoid robots. This task is particularly challenging because a single policy must master a broad repertoire of motion skills while ensuring stability over long-horizon sequences. To this end, we present VMS, a unified whole-body controller that enables humanoid robots to learn diverse and dynamic behaviors within a single policy. Our framework integrates a hybrid tracking objective that balances local motion fidelity with global trajectory consistency, and an Orthogonal Mixture-of-Experts (OMoE) architecture that encourages skill specialization while enhancing generalization across motions. A segment-level tracking reward is further introduced to relax rigid step-wise matching, enhancing robustness when handling global displacements and transient inaccuracies. We validate VMS extensively in both simulation and real-world experiments, demonstrating accurate imitation of dynamic skills, stable performance over minute-long sequences, and strong generalization to unseen motions. These results highlight the potential of VMS as a scalable foundation for versatile humanoid whole-body control. The project page is available at https://kungfubot2-humanoid.github.io.
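The abstract says the segment-level tracking reward relaxes rigid step-wise matching, but this page does not give its formula. The sketch below illustrates the general idea under stated assumptions: the function names, the exponential kernel, the `sigma` scale, and the window size are all hypothetical, not taken from the paper. A step-wise reward compares the robot only with the reference frame at the current step, while the segment version scores it against the closest reference frame in a short window around that step.

```python
import numpy as np

def stepwise_reward(robot_pos, ref_traj, t, sigma=0.25):
    """Rigid step-wise matching: compare only with the reference frame at step t."""
    err = np.linalg.norm(robot_pos - ref_traj[t])
    return float(np.exp(-(err ** 2) / sigma ** 2))

def segment_reward(robot_pos, ref_traj, t, half_window=2, sigma=0.25):
    """Relaxed matching (assumed form): use the closest reference frame
    within a short segment centred on step t."""
    lo = max(0, t - half_window)
    hi = min(len(ref_traj), t + half_window + 1)
    errs = np.linalg.norm(ref_traj[lo:hi] - robot_pos, axis=1)
    return float(np.exp(-(errs.min() ** 2) / sigma ** 2))

# Toy reference: a point moving along x by 0.1 per step.
ref = np.stack([np.arange(10) * 0.1, np.zeros(10), np.zeros(10)], axis=1)

# The robot lags two steps behind the reference at t = 5.
robot = ref[3]
print(stepwise_reward(robot, ref, 5), segment_reward(robot, ref, 5))
```

In this toy case the lagging robot is penalized by the step-wise reward but not by the segment reward, because the window still contains the frame it actually matches. That is one way a short timing slip or a small global drift could be tolerated without discarding the tracking signal.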
Problem

Research questions and friction points this paper is trying to address.

Learning versatile whole-body skills for humanoid robots
Mastering a broad motion repertoire while ensuring long-horizon stability
Achieving robust imitation of diverse human motions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid tracking objective balancing local motion fidelity with global trajectory consistency
Orthogonal Mixture-of-Experts architecture for skill specialization
Segment-level tracking reward enhancing robustness to global displacements and transient inaccuracies
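To make the Orthogonal Mixture-of-Experts idea more concrete, here is a minimal NumPy sketch. This page does not say how orthogonality is enforced, so Gram-Schmidt orthonormalization of the expert features is an assumed stand-in, and `omoe_forward`, the single-layer `tanh` experts, and the softmax router are hypothetical choices for illustration only. Each expert maps the observation to a feature vector, the features are made mutually orthogonal so that experts cannot all encode the same direction, and a router mixes them.

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def gram_schmidt(feats, eps=1e-8):
    """Orthonormalize the rows of feats (one row per expert)."""
    basis = []
    for v in feats:
        w = v.astype(float).copy()
        for b in basis:
            w -= (w @ b) * b          # remove the component along earlier experts
        n = np.linalg.norm(w)
        basis.append(w / n if n > eps else np.zeros_like(w))
    return np.stack(basis)

def omoe_forward(obs, expert_weights, router_weights):
    """Assumed OMoE forward pass.
    expert_weights: (K, d, obs_dim), router_weights: (K, obs_dim)."""
    feats = np.tanh(expert_weights @ obs)    # (K, d) per-expert features
    ortho = gram_schmidt(feats)              # (K, d) mutually orthogonal rows
    gate = softmax(router_weights @ obs)     # (K,) mixing weights
    return gate @ ortho, ortho, gate

rng = np.random.default_rng(0)
K, d, obs_dim = 4, 16, 8
out, ortho, gate = omoe_forward(
    rng.normal(size=obs_dim),
    rng.normal(size=(K, d, obs_dim)),
    rng.normal(size=(K, obs_dim)),
)
print(out.shape, np.round(ortho @ ortho.T, 3))
```

Because the rows of `ortho` are orthonormal, each expert contributes along its own direction in feature space, which is one plausible reading of "skill specialization"; the shared router still lets unseen motions blend several experts.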
Authors
Jinrui Han
Institute of Artificial Intelligence (TeleAI), China Telecom
Weiji Xie
Institute of Artificial Intelligence (TeleAI), China Telecom
Jiakun Zheng
The Hong Kong University of Science and Technology
Computing-in-Memory, AI Accelerator
Jiyuan Shi
Tsinghua University
Reinforcement Learning, Robotics
Weinan Zhang
Shanghai Jiao Tong University
Ting Xiao
East China University of Science and Technology
Chenjia Bai
Institute of Artificial Intelligence (TeleAI), China Telecom
Reinforcement Learning, Robotics, Embodied AI