PHUMA: Physically-Grounded Humanoid Locomotion Dataset

📅 2025-10-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing methods face a trade-off between scalability and physical plausibility: high-fidelity motion-capture datasets (e.g., AMASS) are limited in scale, while large-scale web-video datasets (e.g., Humanoid-X) suffer from pervasive physical artifacts such as floating, interpenetration, and foot skating. This work introduces PHUMA, a physics-constrained framework for constructing large-scale, physically plausible humanoid robot motion datasets. It extracts motions from internet videos via pose estimation and enforces physical consistency through joint-limit constraints, ground-contact modeling, and slip-resistant motion retargeting, scaling data collection without compromising physical realism. Experiments demonstrate substantial improvements over Humanoid-X and AMASS on unseen-motion imitation and pelvis-trajectory tracking, yielding stronger policy stability and cross-motion generalization.

📝 Abstract
Motion imitation is a promising approach for humanoid locomotion, enabling agents to acquire humanlike behaviors. Existing methods typically rely on high-quality motion capture datasets such as AMASS, but these are scarce and expensive, limiting scalability and diversity. Recent studies attempt to scale data collection by converting large-scale internet videos, exemplified by Humanoid-X. However, they often introduce physical artifacts such as floating, penetration, and foot skating, which hinder stable imitation. In response, we introduce PHUMA, a Physically-grounded HUMAnoid locomotion dataset that leverages human video at scale, while addressing physical artifacts through careful data curation and physics-constrained retargeting. PHUMA enforces joint limits, ensures ground contact, and eliminates foot skating, producing motions that are both large-scale and physically reliable. We evaluated PHUMA in two sets of conditions: (i) imitation of unseen motion from self-recorded test videos and (ii) path following with pelvis-only guidance. In both cases, PHUMA-trained policies outperform Humanoid-X and AMASS, achieving significant gains in imitating diverse motions. The code is available at https://davian-robotics.github.io/PHUMA.
Problem

Research questions and friction points this paper is trying to address.

Addresses physical artifacts in motion imitation from video data
Eliminates floating, penetration, and foot skating in humanoid locomotion
Creates physically reliable motions while maintaining large-scale diversity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages large-scale human videos for data generation
Addresses physical artifacts via physics-constrained retargeting
Ensures ground contact and eliminates foot skating
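The three constraints above (joint limits, ground contact, no foot skating) can be sketched as simple post-processing passes over a motion sequence. This is an illustrative assumption, not PHUMA's actual retargeting code; the function name `physics_constrain` and the array layouts are hypothetical.

```python
import numpy as np

def physics_constrain(q, foot_pos, foot_contact, q_min, q_max, ground_z=0.0):
    """Hypothetical sketch of the three physics constraints.

    q: (T, J) joint angles; foot_pos: (T, 2, 3) left/right foot positions;
    foot_contact: (T, 2) boolean contact labels per foot.
    """
    # 1) Joint limits: clamp every frame into the robot's feasible range.
    q = np.clip(q, q_min, q_max)

    # 2) Ground contact: shift each frame vertically so the lowest foot
    #    touches the ground plane instead of floating or penetrating.
    foot_pos = foot_pos.copy()
    lowest = foot_pos[..., 2].min(axis=1)        # (T,) lowest foot height
    foot_pos[..., 2] -= (lowest - ground_z)[:, None]

    # 3) Foot skating: pin a foot's horizontal position for as long as it
    #    stays in contact, eliminating slip during stance.
    for f in range(foot_pos.shape[1]):
        anchor = None
        for t in range(foot_pos.shape[0]):
            if foot_contact[t, f]:
                if anchor is None:
                    anchor = foot_pos[t, f, :2].copy()
                foot_pos[t, f, :2] = anchor      # no horizontal drift
            else:
                anchor = None                    # foot lifted; release pin
    return q, foot_pos
```

In practice such constraints are typically enforced jointly inside an optimization or physics simulator rather than as sequential passes, but the sketch shows the intent of each term.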