🤖 AI Summary
Highly dynamic humanoid parkour on previously unseen, complex terrains remains a significant challenge, because general-purpose locomotion policies struggle to generalize to arbitrary, extreme environments. This work proposes a real-to-sim-to-real framework built on a two-stage end-to-end learning paradigm: a policy is first pre-trained on diverse procedurally generated terrains, then rapidly fine-tuned through test-time training (TTT) on high-fidelity meshes reconstructed from real-world RGB-D captures by a feed-forward geometry reconstruction pipeline. This lets the robot adapt quickly to novel obstacles and traverse challenging structures, including wedges, stakes, boxes, trapezoids, and narrow beams. The full pipeline of capturing, reconstructing, and test-time training takes less than 10 minutes on most tested terrains, and the resulting policy shows robust zero-shot sim-to-real transfer. Adaptation therefore comes from brief, terrain-specific fine-tuning rather than from a single general-purpose policy.
📝 Abstract
Achieving highly dynamic humanoid parkour on unseen, complex terrains remains a challenge in robotics. Although general locomotion policies demonstrate capabilities across broad terrain distributions, they often struggle with arbitrary and highly challenging environments. To overcome this limitation, we propose a real-to-sim-to-real framework that leverages rapid test-time training (TTT) on novel terrains, significantly enhancing the robot's capability to traverse extremely difficult geometries. We adopt a two-stage end-to-end learning paradigm: a policy is first pre-trained on diverse procedurally generated terrains, followed by rapid fine-tuning on high-fidelity meshes reconstructed from real-world captures. Specifically, we develop a feed-forward, efficient, and high-fidelity geometry reconstruction pipeline using RGB-D inputs, ensuring both speed and quality during test-time training. We demonstrate that TTT-Parkour empowers humanoid robots to master complex obstacles, including wedges, stakes, boxes, trapezoids, and narrow beams. The whole pipeline of capturing, reconstructing, and test-time training requires less than 10 minutes on most tested terrains. Extensive experiments show that the policy after test-time training exhibits robust zero-shot sim-to-real transfer capability.