🤖 AI Summary
This work addresses the challenge of policy transfer across diverse robotic end-effectors—such as simple grippers and dexterous hands—whose structural and action-space differences hinder shared control strategies. To overcome this, the authors propose the One-Policy-Fits-All framework, which constructs a geometry-aware implicit action representation. By integrating 3D convolutions with Transformers, the method builds a morphology-agnostic latent action space and employs a unified latent retargeting decoder to enable end-to-end joint training across end-effectors with arbitrary degrees of freedom. Experiments across 11 distinct morphologies demonstrate that cross-morphology joint training improves task success rates by over 50% compared to single-source training. Moreover, with only 8 demonstrations on a new morphology, the approach matches the performance of a model trained on 72 demonstrations—significantly enhancing data efficiency and skill transferability.
📝 Abstract
Cross-embodiment manipulation is crucial for enhancing the scalability of robot manipulation and reducing the high cost of data collection. However, significant differences between embodiments, such as variations in action spaces and structural disparities, pose challenges for joint training across multiple sources of data. To address this, we propose One-Policy-Fits-All (OPFA), a framework that enables learning a single, versatile policy across multiple embodiments. We first learn a Geometry-Aware Latent Representation (GaLR), which leverages 3D convolutional networks and Transformers to build a shared latent action space across different embodiments. Then we design a unified latent retargeting decoder that extracts embodiment-specific actions from the latent representations, without any embodiment-specific decoder tuning. OPFA enables end-to-end co-training on data from diverse embodiments, including various grippers and dexterous hands with arbitrary degrees of freedom, significantly improving data efficiency and reducing the cost of skill transfer. We conduct extensive experiments across 11 different end-effectors. The results demonstrate that OPFA significantly improves policy performance in diverse settings by leveraging heterogeneous embodiment data. For instance, cross-embodiment co-training can improve success rates by more than 50% compared to single-source training. Moreover, by adding only a few demonstrations from a new embodiment (e.g., eight), OPFA can achieve performance comparable to that of a well-trained model with 72 demonstrations.
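The key design in the abstract—a shared latent action space paired with a decoder that emits actions for end-effectors of arbitrary degrees of freedom—can be sketched minimally. Everything below is an illustrative assumption (latent dimension, per-joint query embeddings, single-head attention with random weights), not the paper's actual architecture: the point is only to show how one decoder can serve a 2-DoF gripper and a 16-DoF hand without embodiment-specific tuning.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 32  # latent action dimension (assumed for illustration)

def cross_attention(queries, latent):
    """Single-head cross-attention: each query attends over latent tokens.
    queries: (dof, D), latent: (T, D) -> (dof, D)."""
    scores = queries @ latent.T / np.sqrt(D)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ latent

class RetargetingDecoderSketch:
    """Hypothetical simplification of a unified retargeting decoder:
    one query embedding per degree of freedom, a shared output head,
    so any DoF count works without per-embodiment decoder tuning."""
    def __init__(self, d_latent=D):
        self.w_out = rng.normal(scale=0.1, size=(d_latent,))  # shared head

    def decode(self, latent_tokens, joint_queries):
        attended = cross_attention(joint_queries, latent_tokens)
        return attended @ self.w_out  # (dof,) one scalar command per joint

# Latent action tokens as a geometry-aware encoder might produce them.
latent = rng.normal(size=(8, D))
decoder = RetargetingDecoderSketch()
gripper_action = decoder.decode(latent, rng.normal(size=(2, D)))    # 2-DoF gripper
dex_hand_action = decoder.decode(latent, rng.normal(size=(16, D)))  # 16-DoF hand
print(gripper_action.shape, dex_hand_action.shape)  # (2,) (16,)
```

Because only the per-joint queries differ between embodiments while the attention and output head are shared, data from all end-effectors can update the same decoder parameters during co-training.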