🤖 AI Summary
Existing legged robot control frameworks do not generalize across diverse morphologies (e.g., quadrupeds, bipeds, hexapods) and offer no zero-shot or few-shot cross-morphology transfer capability. To address this, we propose URMA, the Unified Robot Morphology Architecture, which introduces a morphology-agnostic encoder-decoder structure. URMA employs morphology-aware input normalization, a shared latent-space policy network, and plug-and-play morphology-adaptation modules, enabling end-to-end single-policy control of heterogeneous legged robots. It is the first work to extend multi-task reinforcement learning to cross-morphology general locomotion control, a step toward embodied foundation models. Extensive evaluation in simulation and on real robots demonstrates that URMA achieves zero-shot transfer of a single pretrained policy to unseen morphologies without fine-tuning, attaining state-of-the-art stability and adaptability in dynamic locomotion.
📝 Abstract
Deep Reinforcement Learning techniques are achieving state-of-the-art results in robust legged locomotion. While there exists a wide variety of legged platforms, such as quadrupeds, humanoids, and hexapods, the field is still missing a single learning framework that can control all these different embodiments easily and effectively and possibly transfer, zero- or few-shot, to unseen robot embodiments. We introduce URMA, the Unified Robot Morphology Architecture, to close this gap. Our framework brings the end-to-end Multi-Task Reinforcement Learning approach to the realm of legged robots, enabling the learned policy to control any type of robot morphology. The key idea of our method is to allow the network to learn an abstract locomotion controller that can be seamlessly shared between embodiments thanks to our morphology-agnostic encoders and decoders. This flexible architecture can be seen as a potential first step in building a foundation model for legged robot locomotion. Our experiments show that URMA can learn a locomotion policy on multiple embodiments that can be easily transferred to unseen robot platforms in simulation and the real world.
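To make the idea of morphology-agnostic encoders and decoders concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the mechanism, so everything here is an assumption for illustration: each joint carries a fixed-size description vector, the encoder pools per-joint observations with softmax weights computed from those descriptions, and the decoder emits one action per joint from the shared latent plus that joint's description. All names, dimensions, and the random weights are hypothetical.

```python
import math
import random

# Hypothetical sizes: per-joint description, per-joint observation, shared latent.
DESC_DIM, OBS_DIM, LATENT_DIM = 4, 3, 8


def make_matrix(rng, rows, cols):
    """Return a rows x cols matrix of small random weights (stand-in for learned weights)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def init_params(seed=0):
    """One parameter set shared by every robot, regardless of joint count."""
    rng = random.Random(seed)
    return {
        "score": make_matrix(rng, 1, DESC_DIM),           # description -> pooling logit
        "embed": make_matrix(rng, LATENT_DIM, OBS_DIM),   # observation -> latent contribution
        "core": make_matrix(rng, LATENT_DIM, LATENT_DIM), # shared controller layer
        "decode": make_matrix(rng, 1, LATENT_DIM + DESC_DIM),
    }


def encode(params, descriptions, observations):
    """Pool any number of joints into a fixed-size latent via softmax weights."""
    logits = [matvec(params["score"], d)[0] for d in descriptions]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    latent = [0.0] * LATENT_DIM
    for e, obs in zip(exps, observations):
        for i, value in enumerate(matvec(params["embed"], obs)):
            latent[i] += (e / total) * value
    return latent


def policy(params, descriptions, observations):
    """Return one action per joint using the same weights for every morphology."""
    latent = encode(params, descriptions, observations)
    core = [math.tanh(x) for x in matvec(params["core"], latent)]
    return [matvec(params["decode"], core + list(d))[0] for d in descriptions]


def random_robot(rng, num_joints):
    """Build placeholder descriptions and observations for a robot with num_joints joints."""
    descriptions = [[rng.uniform(-1, 1) for _ in range(DESC_DIM)] for _ in range(num_joints)]
    observations = [[rng.uniform(-1, 1) for _ in range(OBS_DIM)] for _ in range(num_joints)]
    return descriptions, observations


params = init_params()
rng = random.Random(1)
quadruped = random_robot(rng, 12)  # hypothetical 12-joint robot
hexapod = random_robot(rng, 18)    # hypothetical 18-joint robot
quad_actions = policy(params, *quadruped)
hex_actions = policy(params, *hexapod)
print(len(quad_actions), len(hex_actions))  # 12 18
```

The point of the sketch is that the latent has the same size for both robots, so the shared core never sees the joint count, and the output length follows the number of joint descriptions passed in. The pooling step also ignores the order in which joints are listed.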