OmniXtreme: Breaking the Generality Barrier in High-Dynamic Humanoid Control

📅 2026-02-27
🤖 AI Summary
This work addresses the challenge of high-fidelity, general-purpose multi-skill execution on real humanoid robots, where greater motion diversity often degrades tracking accuracy and physical feasibility. The authors propose OmniXtreme, a framework that decouples general motor skill learning from physics-aware execution refinement. First, a high-capacity flow-matching policy learns a diverse repertoire of motions; an actuation-aware sim-to-real refinement stage then improves deployability on physical hardware. According to the authors, this breaks the long-standing trade-off between fidelity and scalability and allows a single policy to execute multiple extreme, high-difficulty maneuvers stably on a real humanoid. The method is reported to maintain high tracking fidelity across a challenging, diverse motion dataset.

📝 Abstract
High-fidelity motion tracking serves as the ultimate litmus test for generalizable, human-level motor skills. However, current policies often hit a "generality barrier": as motion libraries scale in diversity, tracking fidelity inevitably collapses, especially for real-world deployment of high-dynamic motions. We identify this failure as the result of two compounding factors: the learning bottleneck in scaling multi-motion optimization and the physical executability constraints that arise in real-world actuation. To overcome these challenges, we introduce OmniXtreme, a scalable framework that decouples general motor skill learning from sim-to-real physical skill refinement. Our approach uses a flow-matching policy with high-capacity architectures to scale representation capacity without interference-intensive multi-motion RL optimization, followed by an actuation-aware refinement phase that ensures robust performance on physical hardware. Extensive experiments demonstrate that OmniXtreme maintains high-fidelity tracking across diverse, high-difficulty datasets. On real robots, the unified policy successfully executes multiple extreme motions, effectively breaking the long-standing fidelity-scalability trade-off in high-dynamic humanoid control.
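The abstract does not spell out the flow-matching objective, so the following is only a generic sketch of how a flow-matching policy is commonly trained and sampled, not the authors' implementation. It assumes a straight-line path between Gaussian noise and an expert action; the names `flow_matching_sample`, `flow_matching_loss`, `sample_action`, and `velocity_fn` are hypothetical, and plain Python lists stand in for tensors.

```python
import random

def flow_matching_sample(action, rng):
    """Return (x_t, t, target_velocity) for one expert action on a linear path."""
    noise = [rng.gauss(0.0, 1.0) for _ in action]      # x0 ~ N(0, I)
    t = rng.random()                                    # t ~ U[0, 1)
    # Point on the straight line from noise (t=0) to the expert action (t=1).
    x_t = [(1.0 - t) * n + t * a for n, a in zip(noise, action)]
    # Time derivative of x_t along that line; constant in t.
    target = [a - n for n, a in zip(noise, action)]
    return x_t, t, target

def flow_matching_loss(velocity_fn, obs, action, rng):
    """Mean squared error between predicted and target velocity for one sample."""
    x_t, t, target = flow_matching_sample(action, rng)
    pred = velocity_fn(x_t, t, obs)  # hypothetical policy network: (x_t, t, obs) -> velocity
    return sum((p - g) ** 2 for p, g in zip(pred, target)) / len(action)

def sample_action(velocity_fn, obs, action_dim, rng, steps=10):
    """Euler-integrate the velocity field from noise (t=0) toward an action (t=1)."""
    x = [rng.gauss(0.0, 1.0) for _ in range(action_dim)]
    dt = 1.0 / steps
    for k in range(steps):
        v = velocity_fn(x, k * dt, obs)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x
```

In this sketch the policy is trained by regression onto a velocity target rather than by a multi-motion RL objective, which is the general property the abstract appeals to. The actuation-aware refinement stage is not covered here.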
Problem

Research questions and friction points this paper is trying to address.

humanoid control
motion tracking
generality barrier
high-dynamic motions
sim-to-real
Innovation

Methods, ideas, or system contributions that make the work stand out.

flow-matching policy
sim-to-real refinement
actuation-aware control
high-dynamic humanoid control
scalable motor skill learning
Yunshen Wang
State Key Laboratory of General Artificial Intelligence, Beijing Institute for General Artificial Intelligence (BIGAI); Joint Laboratory of Embodied AI and Humanoid Robots, BIGAI & Unitree Robotics; Shanghai Jiao Tong University
Shaohang Zhu
State Key Laboratory of General Artificial Intelligence, Beijing Institute for General Artificial Intelligence (BIGAI); Joint Laboratory of Embodied AI and Humanoid Robots, BIGAI & Unitree Robotics; University of Science and Technology of China
Peiyuan Zhi
Unknown affiliation
Yuhan Li
PhD Student in School of Mathematical Sciences, Queen Mary University of London
Network Science, Digital Epidemiology, Complex Networks
Jiaxin Li
State Key Laboratory of General Artificial Intelligence, Beijing Institute for General Artificial Intelligence (BIGAI); Joint Laboratory of Embodied AI and Humanoid Robots, BIGAI & Unitree Robotics; Beijing Institute of Technology
Yong-Lu Li
Associate Professor, Shanghai Jiao Tong University/Shanghai Innovation Institute
Physical Reasoning, Robotics, Computer Vision, Machine Learning, Embodied AI
Yuchen Xiao
Lead of Embodied AI R&D, Unitree | Research Scientist, J.P. Morgan | Ph.D. Northeastern University
Generative Models, Robot Learning, Reinforcement Learning, Multi-Agent Systems
Xingxing Wang
Unitree Robotics
Baoxiong Jia
Ph.D. in Computer Science, UCLA
Computer Vision, Artificial Intelligence
Siyuan Huang
Beijing Institute for General Artificial Intelligence (BIGAI)
Embodied AI, 3D Vision, Robotics, 3D Scene Understanding