One Policy to Run Them All: an End-to-end Learning Approach to Multi-Embodiment Locomotion

📅 2024-09-10

🏛️ Conference on Robot Learning

📈 Citations: 7

✨ Influential: 1

🤖 AI Summary

Existing legged-robot control frameworks do not generalize across diverse morphologies (e.g., quadrupeds, humanoids, hexapods), and no single learning framework controls them all or transfers zero- or few-shot to unseen embodiments. To address this, the paper proposes URMA, the Unified Robot Morphology Architecture, which places morphology-agnostic encoders and decoders around a shared policy so that one network learns an abstract locomotion controller for heterogeneous legged robots. The framework brings end-to-end multi-task reinforcement learning to legged locomotion, and the authors present it as a potential first step toward a foundation model for this domain. Experiments show that a policy trained on multiple embodiments can be transferred to unseen robot platforms in simulation and in the real world.

📝 Abstract

Deep Reinforcement Learning techniques are achieving state-of-the-art results in robust legged locomotion. While there exists a wide variety of legged platforms, such as quadrupeds, humanoids, and hexapods, the field is still missing a single learning framework that can control all these different embodiments easily and effectively and possibly transfer, zero- or few-shot, to unseen robot embodiments. We introduce URMA, the Unified Robot Morphology Architecture, to close this gap. Our framework brings the end-to-end Multi-Task Reinforcement Learning approach to the realm of legged robots, enabling the learned policy to control any type of robot morphology. The key idea of our method is to allow the network to learn an abstract locomotion controller that can be seamlessly shared between embodiments thanks to our morphology-agnostic encoders and decoders. This flexible architecture can be seen as a potential first step in building a foundation model for legged robot locomotion. Our experiments show that URMA can learn a locomotion policy on multiple embodiments that can be easily transferred to unseen robot platforms in simulation and the real world.

Problem

Research questions and friction points this paper is trying to address.

Develop a unified learning framework for diverse legged robots

Enable zero- or few-shot transfer to unseen robot embodiments

Create an abstract locomotion controller for multiple morphologies

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified end-to-end Multi-Task Reinforcement Learning framework

Morphology-agnostic encoders and decoders for flexibility

Zero- or few-shot transfer to unseen robot embodiments
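
The idea of morphology-agnostic encoders and decoders can be illustrated with a small sketch. It is not the authors' implementation: the summary above does not specify the architecture, so everything here is an assumption made for illustration, including the per-joint description vectors, the attention-style pooling, the layer sizes, the random untrained weights, and all names (`encode`, `policy`, `random_robot`). The point it shows is only that weights which never depend on the number of joints can map a robot with any joint count to one action per joint.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for this illustration.
DESC_DIM, OBS_DIM, LATENT_DIM = 4, 3, 8

# Shared weights: none of these shapes depend on the number of joints.
W_key = rng.normal(size=(DESC_DIM, LATENT_DIM))
W_val = rng.normal(size=(OBS_DIM, LATENT_DIM))
W_core = rng.normal(size=(LATENT_DIM, LATENT_DIM))
W_dec = rng.normal(size=(LATENT_DIM + DESC_DIM, 1))


def softmax(x, axis=0):
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def encode(joint_desc, joint_obs):
    """Pool a variable number of joints into one fixed-size latent."""
    weights = softmax(joint_desc @ W_key, axis=0)  # (J, LATENT_DIM), sums to 1 over joints
    values = np.tanh(joint_obs @ W_val)            # (J, LATENT_DIM)
    return (weights * values).sum(axis=0)          # (LATENT_DIM,)


def policy(joint_desc, joint_obs):
    """Return one action per joint, whatever the joint count."""
    latent = np.tanh(encode(joint_desc, joint_obs) @ W_core)  # shared core
    # Decoder: pair the shared latent with each joint's own description.
    tiled = np.tile(latent, (joint_desc.shape[0], 1))
    per_joint = np.concatenate([tiled, joint_desc], axis=1)
    return np.tanh(per_joint @ W_dec).ravel()


def random_robot(num_joints):
    """Stand-in for a real robot: random descriptions and observations."""
    desc = rng.normal(size=(num_joints, DESC_DIM))
    obs = rng.normal(size=(num_joints, OBS_DIM))
    return desc, obs


quadruped_actions = policy(*random_robot(12))  # e.g. a 12-joint robot
hexapod_actions = policy(*random_robot(18))    # e.g. an 18-joint robot
print(quadruped_actions.shape, hexapod_actions.shape)  # (12,) (18,)
```

Because the encoder sums over joints and the decoder acts on each joint separately, reordering the joints of a robot simply reorders the actions in this sketch. A trained system would learn these weights with reinforcement learning across several embodiments rather than sampling them at random.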
