🤖 AI Summary
Existing robotic foundation models struggle to generalize across varying viewpoints, manipulator configurations, and end-effectors beyond parallel-jaw grippers, due to biases in their training data. To address this limitation, this work proposes the Cross-Embodiment Interface (CEI), a framework that aligns heterogeneous robot trajectories through functional similarity. By leveraging the Directional Chamfer Distance and gradient-based optimization, CEI synthesizes observations and actions tailored to novel embodiments, enabling bidirectional policy transfer. The approach also supports spatial generalization and multimodal motion generation, and successfully transfers policies from a Franka Panda to 16 distinct embodiments in simulation. In real-world experiments, it achieves an average transfer ratio of 82.4% across six tasks when transferring between UR5 robots equipped with AG95 and XHand end-effectors.
📝 Abstract
Robotic foundation models trained on large-scale manipulation datasets have shown promise in learning generalist policies, but they often overfit to specific viewpoints, robot arms, and especially parallel-jaw grippers due to dataset biases. To address this limitation, we propose the Cross-Embodiment Interface (CEI), a framework for cross-embodiment learning that enables the transfer of demonstrations across different robot arm and end-effector morphologies. CEI introduces the concept of functional similarity, quantified using the Directional Chamfer Distance. It then aligns robot trajectories through gradient-based optimization and synthesizes observations and actions for unseen robot arms and end-effectors. In experiments, CEI transfers data and policies from a Franka Panda robot to 16 different embodiments across 3 tasks in simulation, and supports bidirectional transfer between a UR5+AG95 gripper robot and a UR5+XHand robot across 6 real-world tasks, achieving an average transfer ratio of 82.4%. Finally, we demonstrate that CEI can also be extended with spatial generalization and multimodal motion generation capabilities using our proposed techniques.
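The functional-similarity metric the abstract names is a directional Chamfer distance between point sets. The paper's exact formulation (point representation, weighting, direction convention) is not given here, so the following is a minimal NumPy sketch of the generic one-directional form: for each point in the source set, take the distance to its nearest neighbor in the target set, then average.

```python
import numpy as np

def directional_chamfer(src: np.ndarray, tgt: np.ndarray) -> float:
    """One-directional Chamfer distance from `src` to `tgt`.

    src, tgt: (N, 3) and (M, 3) arrays of 3D points.
    Generic textbook form; the paper's exact variant is an assumption.
    """
    # Pairwise Euclidean distances, shape (N, M), via broadcasting.
    d = np.linalg.norm(src[:, None, :] - tgt[None, :, :], axis=-1)
    # For each source point, distance to its nearest target point; average.
    return float(d.min(axis=1).mean())
```

Note the asymmetry: every source point must lie near some target point, but not vice versa, which is why the metric is called "directional". Because the expression is differentiable almost everywhere in the point coordinates, it is compatible with the gradient-based trajectory alignment the abstract describes.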