🤖 AI Summary
This work addresses zero-shot transfer of a general locomotion policy across morphologically diverse legged robots, including previously unseen real-world humanoid and quadruped platforms. We propose URMAv2, an improved embodiment-aware architecture, and combine it with extreme embodiment randomization, a performance-based curriculum, and deep reinforcement learning to train a single unified policy on 50 heterogeneous legged robots. The curriculum lets the policy cover millions of morphological variations, and the trained policy supports plug-and-play deployment on physical systems. Experiments show that the learned policy controls walking on unseen real-world robots without additional training. URMAv2 offers a scalable approach to universal locomotion control in embodied intelligence.
📝 Abstract
We present a single, general locomotion policy trained on a diverse collection of 50 legged robots. By combining an improved embodiment-aware architecture (URMAv2) with a performance-based curriculum for extreme Embodiment Randomization, our policy learns to control millions of morphological variations. Our policy achieves zero-shot transfer to unseen real-world humanoid and quadruped robots.
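To make the idea of a performance-based curriculum for embodiment randomization concrete, here is a minimal sketch. It is not the authors' implementation: the abstract gives no details, so the class `RandomizationCurriculum`, its fields (`level`, `step`, `threshold`, `max_scale_dev`), and the rule "widen the sampling range once normalized performance crosses a threshold" are all illustrative assumptions.

```python
import random
from dataclasses import dataclass


@dataclass
class RandomizationCurriculum:
    """Hypothetical performance-gated curriculum (an assumption, not URMAv2's)."""

    level: float = 0.0          # 0 = nominal robot, 1 = full randomization range
    step: float = 0.1           # how much the range widens per promotion
    threshold: float = 0.8      # normalized performance required to advance
    max_scale_dev: float = 0.5  # at level 1, scale factors span [0.5, 1.5]

    def update(self, performance: float) -> float:
        """Widen the randomization range only when the policy performs well."""
        if performance >= self.threshold:
            self.level = min(1.0, self.level + self.step)
        return self.level

    def sample_scale(self, rng: random.Random) -> float:
        """Draw one multiplicative factor, e.g. for a link length or a mass."""
        dev = self.level * self.max_scale_dev
        return rng.uniform(1.0 - dev, 1.0 + dev)
```

In a training loop, one such curriculum could be kept per robot or per randomized parameter: `update` would be called with a normalized evaluation score after each evaluation round, and `sample_scale` would be called whenever an environment is reset with a new morphology variant.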