Towards Adaptive Humanoid Control via Multi-Behavior Distillation and Reinforced Fine-Tuning

📅 2025-11-09
📈 Citations: 0
Influential: 0

🤖 AI Summary
Existing humanoid motion control approaches rely on single-task, specialized policies, which generalize poorly and adapt badly to new environments. This paper proposes a two-stage framework that combines multi-behavior distillation with online reinforcement fine-tuning. In the first stage, knowledge distillation fuses diverse locomotion policies (such as standing up, walking, running, and jumping) into a unified foundational controller. In the second stage, the controller is fine-tuned with reinforcement learning using online feedback to adapt it to varied terrains. The work is presented as the first to combine multi-policy distillation with reinforcement fine-tuning in this way, with the aim of improving robustness and generalization across skill transitions and complex terrains, including slopes, gravel, and stairs. Evaluations in simulation and on a physical Unitree G1 robot report stable execution of diverse locomotion tasks, with better generality and deployment reliability than the baseline methods.

📝 Abstract
Humanoid robots show promise for learning a diverse set of human-like locomotion behaviors, including standing up, walking, running, and jumping. However, existing methods predominantly require training an independent policy for each skill, yielding behavior-specific controllers that exhibit limited generalization and brittle performance when deployed on irregular terrains and in diverse situations. To address this challenge, we propose Adaptive Humanoid Control (AHC), which adopts a two-stage framework to learn an adaptive humanoid locomotion controller across different skills and terrains. Specifically, we first train several primary locomotion policies and perform a multi-behavior distillation process to obtain a basic multi-behavior controller, facilitating adaptive behavior switching based on the environment. Then, we perform reinforced fine-tuning by collecting online feedback while performing adaptive behaviors on more diverse terrains, enhancing the terrain adaptability of the controller. We conduct experiments both in simulation and in the real world on Unitree G1 robots. The results show that our method exhibits strong adaptability across various situations and terrains. Project website: https://ahc-humanoid.github.io.
Problem

Research questions and friction points this paper is trying to address.

Training independent policies for each humanoid skill limits generalization
Behavior-specific controllers perform poorly on irregular terrains
Existing methods lack adaptability across diverse situations and skills
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-behavior distillation enables adaptive locomotion switching
Reinforced fine-tuning enhances terrain adaptability
Two-stage framework integrates diverse skills and terrains
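
The first stage, distilling several behavior-specific policies into one controller, can be illustrated with a toy sketch. This is not the paper's implementation: the two teacher functions, the `fallen` flag, the linear student, and the squared-error loss are all hypothetical stand-ins chosen to keep the example small. It only shows the general idea of a single student imitating whichever teacher is appropriate for the current situation. The second stage (reinforced fine-tuning on diverse terrains) is not shown.

```python
import random

# Hypothetical stand-ins for two pre-trained, behavior-specific teachers.
# Real teachers would be neural network policies; these are toy functions.
def recovery_teacher(x):
    return 2.0 * x

def walking_teacher(x):
    return -1.0 * x

def select_teacher(fallen):
    # Assumed switching rule: pick the teacher that matches the situation.
    return recovery_teacher if fallen else walking_teacher

def student_features(x, fallen):
    # The student observes the state plus the context needed to switch.
    return [x * fallen, x * (1 - fallen)]

def student_action(weights, x, fallen):
    feats = student_features(x, fallen)
    return sum(w * f for w, f in zip(weights, feats))

def distill(steps=2000, lr=0.1, seed=0):
    """Fit one linear student to both teachers by stochastic gradient descent."""
    rng = random.Random(seed)
    weights = [0.0, 0.0]
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)      # toy one-dimensional state
        fallen = rng.randint(0, 1)      # toy situation indicator
        target = select_teacher(fallen)(x)
        error = student_action(weights, x, fallen) - target
        # Gradient of 0.5 * error**2 with respect to each weight.
        feats = student_features(x, fallen)
        weights = [w - lr * error * f for w, f in zip(weights, feats)]
    return weights

weights = distill()
```

After training, the single student reproduces the recovery teacher when `fallen` is 1 and the walking teacher when it is 0, which is the behavior-switching property the distillation stage is meant to provide.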