UniLegs: Universal Multi-Legged Robot Control through Morphology-Agnostic Policy Distillation

📅 2025-07-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the poor generalizability of morphology-specific control policies for legged robots, this paper proposes a two-stage teacher-student framework. First, dedicated reinforcement learning policies (teachers) are trained individually for distinct legged morphologies. Second, knowledge from these teachers is distilled into a single morphology-agnostic student policy implemented as a Transformer. The student uses self-attention to model relationships between joints across different kinematic structures, which supports zero-shot transfer to unseen configurations. In experiments across five legged morphologies, the student achieves 94.47% of teacher performance on training morphologies and retains 72.64% on unseen robot designs, and Transformer students consistently outperform MLP baselines. The student policy is also deployed on a physical quadruped robot. The work offers a scalable approach to universal legged locomotion controllers that stay close to specialist performance while generalizing across morphologies.

📝 Abstract

Developing controllers that generalize across diverse robot morphologies remains a significant challenge in legged locomotion. Traditional approaches either create specialized controllers for each morphology or compromise performance for generality. This paper introduces a two-stage teacher-student framework that bridges this gap through policy distillation. First, we train specialized teacher policies optimized for individual morphologies, capturing the unique optimal control strategies for each robot design. Then, we distill this specialized expertise into a single Transformer-based student policy capable of controlling robots with varying leg configurations. Our experiments across five distinct legged morphologies demonstrate that our approach preserves morphology-specific optimal behaviors, with the Transformer architecture achieving 94.47% of teacher performance on training morphologies and 72.64% on unseen robot designs. Comparative analysis reveals that Transformer-based architectures consistently outperform MLP baselines by leveraging attention mechanisms to effectively model joint relationships across different kinematic structures. We validate our approach through successful deployment on a physical quadruped robot, demonstrating the practical viability of our morphology-agnostic control framework. This work presents a scalable solution for developing universal legged robot controllers that maintain near-optimal performance while generalizing across diverse morphologies.
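The distillation stage described above can be illustrated with a small sketch. It assumes the student is trained to match teacher actions with a mean-squared-error loss, and that action vectors are zero-padded to a common joint count with a mask marking real joints. The abstract does not state the loss or the padding scheme, so both are assumptions, and the names `masked_distillation_loss`, `pad`, and `MAX_JOINTS` are hypothetical.

```python
from typing import List, Sequence, Tuple


def masked_distillation_loss(
    student_actions: Sequence[Sequence[float]],
    teacher_actions: Sequence[Sequence[float]],
    joint_mask: Sequence[Sequence[bool]],
) -> float:
    """Mean squared error between student and teacher actions,
    averaged over real (non-padded) joints only."""
    total, count = 0.0, 0
    for s_row, t_row, m_row in zip(student_actions, teacher_actions, joint_mask):
        for s, t, is_real in zip(s_row, t_row, m_row):
            if is_real:
                total += (s - t) ** 2
                count += 1
    return total / count if count else 0.0


def pad(actions: Sequence[float], max_joints: int) -> Tuple[List[float], List[bool]]:
    """Zero-pad an action vector to max_joints and return it with its mask."""
    n_pad = max_joints - len(actions)
    return list(actions) + [0.0] * n_pad, [True] * len(actions) + [False] * n_pad


# Toy batch: two hypothetical robots with 4 and 6 actuated joints.
MAX_JOINTS = 6
teacher_a, mask_a = pad([0.1, -0.2, 0.3, 0.0], MAX_JOINTS)
teacher_b, mask_b = pad([0.5, 0.5, -0.5, -0.5, 0.1, 0.1], MAX_JOINTS)

# Student outputs; values in padded slots (9.9) are ignored by the mask.
student_a = [0.1, -0.2, 0.3, 0.0, 9.9, 9.9]
student_b = [0.5, 0.5, -0.5, -0.5, 0.1, 0.3]

loss = masked_distillation_loss(
    [student_a, student_b], [teacher_a, teacher_b], [mask_a, mask_b]
)
print(f"distillation loss: {loss:.4f}")
```

In this toy batch only one real joint differs from its teacher target (by 0.2), so the loss is 0.04 averaged over the 10 real joints. Masking keeps robots with fewer joints from being penalized for whatever the student emits in padded slots.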

Problem

Research questions and friction points this paper is trying to address.

Developing controllers for diverse legged robot morphologies

Bridging the performance gap between specialized and general controllers

Creating a universal control policy for varying leg configurations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage teacher-student framework for control

Transformer-based policy for diverse leg configurations

Attention mechanisms that model joint relationships across different kinematic structures
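One way a single attention layer can serve robots with different joint counts is to treat each joint as a token and mask out padding. The sketch below is a minimal single-head version in plain Python. The per-joint feature layout (position, velocity), the identity projection weights, and the function names are illustrative assumptions, not the paper's architecture.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(w: Matrix, x: Vector) -> Vector:
    """Multiply matrix w (rows x cols) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def masked_self_attention(
    tokens: List[Vector], mask: List[bool], w_q: Matrix, w_k: Matrix, w_v: Matrix
) -> List[Vector]:
    """Single-head scaled dot-product self-attention over joint tokens.
    Keys whose mask entry is False (padding) receive zero attention weight.
    Assumes at least one mask entry is True."""
    queries = [matvec(w_q, t) for t in tokens]
    keys = [matvec(w_k, t) for t in tokens]
    values = [matvec(w_v, t) for t in tokens]
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        scores = [
            sum(a * b for a, b in zip(q, k)) / scale if is_real else -math.inf
            for k, is_real in zip(keys, mask)
        ]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        norm = sum(weights)
        weights = [w / norm for w in weights]
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(len(values[0]))]
        )
    return outputs


# Hypothetical per-joint features: [position, velocity].
identity = [[1.0, 0.0], [0.0, 1.0]]
joints = [[0.2, -0.1], [0.4, 0.0], [-0.3, 0.5]]

# Same robot, once unpadded and once padded to four token slots.
out_plain = masked_self_attention(joints, [True] * 3, identity, identity, identity)
out_padded = masked_self_attention(
    joints + [[0.0, 0.0]], [True, True, True, False], identity, identity, identity
)
```

Because padded keys get zero weight, the outputs for the three real joints are the same with or without padding, which is what lets one set of weights process morphologies with different numbers of joints in a single batch.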

🔎 Similar Papers

One Policy to Run Them All: an End-to-end Learning Approach to Multi-Embodiment Locomotion

2024-09-10, Conference on Robot Learning, Citations: 7
