Towards Adaptive Humanoid Control via Multi-Behavior Distillation and Reinforced Fine-Tuning

📅 2025-11-09
🤖 AI Summary
Existing humanoid robot motion control approaches rely on single-task specialized policies, exhibiting poor generalization and limited environmental adaptability. This paper proposes a two-stage framework integrating multi-behavior distillation and online reinforcement fine-tuning: first, knowledge distillation fuses diverse locomotion policies—such as standing up, walking, running, and jumping—into a unified foundational controller; second, terrain-adaptive reinforcement fine-tuning is performed online using real-time sensory feedback. To our knowledge, this is the first work to synergistically combine multi-policy distillation with real-time reinforcement fine-tuning, significantly enhancing robustness and generalization across skill transitions and complex terrains—including slopes, gravel, and stairs. Extensive evaluations in simulation and on a physical Unitree G1 robot demonstrate stable execution of diverse locomotion tasks, with the proposed controller achieving substantially superior generality and deployment reliability compared to baseline methods.

📝 Abstract
Humanoid robots hold promise for learning a diverse set of human-like locomotion behaviors, including standing up, walking, running, and jumping. However, existing methods predominantly train an independent policy for each skill, yielding behavior-specific controllers with limited generalization and brittle performance when deployed on irregular terrains and in diverse situations. To address this challenge, we propose Adaptive Humanoid Control (AHC), a two-stage framework that learns an adaptive humanoid locomotion controller across different skills and terrains. Specifically, we first train several primary locomotion policies and perform a multi-behavior distillation process to obtain a basic multi-behavior controller, enabling adaptive behavior switching based on the environment. We then perform reinforced fine-tuning by collecting online feedback while executing adaptive behaviors on more diverse terrains, enhancing the controller's terrain adaptability. We conduct experiments in both simulation and the real world on a Unitree G1 robot. The results show that our method exhibits strong adaptability across various situations and terrains. Project website: https://ahc-humanoid.github.io.
Problem

Research questions and friction points this paper is trying to address.

Training independent policies for each humanoid skill limits generalization
Behavior-specific controllers perform poorly on irregular terrains
Existing methods lack adaptability across diverse situations and skills
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-behavior distillation enables adaptive locomotion switching
Reinforced fine-tuning enhances terrain adaptability
Two-stage framework integrates diverse skills and terrains
Yingnan Zhao
College of Computer Science and Technology, Harbin Engineering University
Xinmiao Wang
College of Computer Science and Technology, Harbin Engineering University
Dewei Wang
USTC
Xinzhe Liu
ShanghaiTech University
Dan Lu
College of Computer Science and Technology, Harbin Engineering University
Qilong Han
College of Computer Science and Technology, Harbin Engineering University
Peng Liu
College of Computer Science and Technology, Harbin Institute of Technology
Chenjia Bai
Institute of Artificial Intelligence, China Telecom (TeleAI)
Reinforcement Learning · Robotics · Embodied AI