H-Zero: Cross-Humanoid Locomotion Pretraining Enables Few-shot Novel Embodiment Transfer

📅 2025-11-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing humanoid locomotion controllers generalize poorly: each platform requires its own tuning of reward functions, physical parameters, and training hyperparameters, which hinders transfer across embodiments. Method: H-Zero is a cross-humanoid locomotion pretraining pipeline that learns a generalizable base policy from a limited set of humanoid embodiments and then adapts it to new robots with minimal fine-tuning. Contribution/Results: The pretrained policy supports zero-shot and few-shot transfer. On unseen robots in simulation, it maintains up to 81% of the full episode duration, and it transfers to unseen humanoids and upright quadrupeds within 30 minutes of fine-tuning. This reduces the per-robot tuning effort needed to bring up a new platform.

📝 Abstract

The rapid advancement of humanoid robotics has intensified the need for robust and adaptable controllers to enable stable and efficient locomotion across diverse platforms. However, developing such controllers remains a significant challenge because existing solutions are tailored to specific robot designs, requiring extensive tuning of reward functions, physical parameters, and training hyperparameters for each embodiment. To address this challenge, we introduce H-Zero, a cross-humanoid locomotion pretraining pipeline that learns a generalizable humanoid base policy. We show that pretraining on a limited set of embodiments enables zero-shot and few-shot transfer to novel humanoid robots with minimal fine-tuning. Evaluations show that the pretrained policy maintains up to 81% of the full episode duration on unseen robots in simulation while enabling few-shot transfer to unseen humanoids and upright quadrupeds within 30 minutes of fine-tuning.
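The zero-shot figure above is reported as a fraction of the full episode duration. As a rough illustration only, the sketch below shows one plausible way such a fraction could be computed from rollout lengths. The function name, the averaging over episodes, and the example numbers are all assumptions made for illustration; the paper's exact evaluation protocol is not described in this summary.

```python
from typing import Sequence


def episode_duration_fraction(survived_steps: Sequence[int], max_steps: int) -> float:
    """Mean fraction of the full episode length survived before termination.

    Hypothetical helper: assumes each rollout ends either on a fall (early
    termination) or at the time limit `max_steps`.
    """
    if max_steps <= 0:
        raise ValueError("max_steps must be positive")
    if not survived_steps:
        raise ValueError("need at least one episode")
    # Clip each rollout length to the valid range [0, max_steps].
    clipped = [min(max(s, 0), max_steps) for s in survived_steps]
    return sum(clipped) / (len(clipped) * max_steps)


# Made-up rollout lengths for an unseen robot, not data from the paper.
lengths = [1000, 620, 810]
print(f"{episode_duration_fraction(lengths, max_steps=1000):.2f}")
```

Under this reading, a policy that never falls scores 1.0 and one that falls immediately scores close to 0, so the metric rewards staying upright for longer even when the episode is not completed.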

Problem

Research questions and friction points this paper is trying to address.

Enables few-shot transfer to novel humanoid robots

Learns a generalizable locomotion policy across embodiments

Reduces need for extensive tuning per robot design

Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-humanoid pretraining for generalizable locomotion policy

Zero-shot and few-shot transfer to novel humanoid robots

Minimal fine-tuning enables adaptation within 30 minutes

🔎 Similar Papers

Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation

2024-06-20, arXiv.org, Citations: 3
