AI Summary
Robot manipulation policies generalize poorly across morphologically distinct embodiments, and the field lacks standardized evaluation protocols. Method: We propose a benchmark for cross-morphology manipulation that covers two foundational tasks, reaching and pushing, and enables systematic assessment of generalization by interpolation, extrapolation, and composition. We introduce a three-axis evaluation framework that formally defines and quantifies "morphology-agnostic manipulation policy generalization." Our approach employs a morphology-aware policy architecture trained via multi-morphology joint reinforcement learning and grounded in structured simulation modeling to enable zero-shot transfer. Contribution/Results: The benchmark addresses the lack of standardized evaluation for cross-embodiment manipulation. Experiments reveal substantial performance degradation under morphological extrapolation; morphology-aware training consistently outperforms single-morphology baselines, yet zero-shot generalization across structurally divergent embodiments remains a fundamental challenge.
Abstract
Generalizing control policies to novel embodiments remains a fundamental challenge in enabling scalable and transferable learning in robotics. While prior work has explored this problem in locomotion, systematic study in the context of manipulation remains limited, partly due to the lack of standardized benchmarks. In this paper, we introduce a benchmark for learning cross-embodiment manipulation, focusing on two foundational tasks, reach and push, across a diverse range of morphologies. The benchmark is designed to test generalization along three axes: interpolation (testing within a robot category that shares the same link structure), extrapolation (testing on a robot with a different link structure), and composition (testing on combinations of link structures). On the benchmark, we evaluate the ability of different RL policies to learn from multiple morphologies and to generalize to novel ones. Our study aims to answer three questions: whether morphology-aware training can outperform single-embodiment baselines, whether zero-shot generalization to unseen morphologies is feasible, and how consistently these patterns hold across the generalization regimes. The results highlight the current limitations of multi-embodiment learning and provide insights into how architectural and training design choices influence policy generalization.
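The three generalization axes can be illustrated with a minimal sketch. It is not the benchmark's implementation: it assumes, purely for illustration, that a link structure can be written as a tuple of joint-type labels and that "composition" means an unseen structure formed by concatenating structures seen in training. The names `classify_regime` and `is_concatenation_of_seen`, and the joint labels, are hypothetical.

```python
from typing import Set, Tuple

# Assumption: a link structure is a tuple of joint-type labels.
LinkStructure = Tuple[str, ...]


def is_concatenation_of_seen(structure: LinkStructure,
                             seen: Set[LinkStructure]) -> bool:
    """Return True if `structure` can be split into consecutive pieces
    that each appear in `seen` (a word-break style dynamic program)."""
    n = len(structure)
    if n == 0:
        return False
    # reachable[i] is True when structure[:i] can be tiled by seen structures.
    reachable = [False] * (n + 1)
    reachable[0] = True
    for end in range(1, n + 1):
        for start in range(end):
            if reachable[start] and structure[start:end] in seen:
                reachable[end] = True
                break
    return reachable[n]


def classify_regime(test: LinkStructure, seen: Set[LinkStructure]) -> str:
    """Assign a test morphology to one of the three generalization regimes."""
    if test in seen:
        # Same link structure as a training robot (other parameters may differ).
        return "interpolation"
    if is_concatenation_of_seen(test, seen):
        # Unseen structure built from structures seen in training.
        return "composition"
    # Link structure not covered by the training set in either sense.
    return "extrapolation"


# Hypothetical training set with two link structures.
train_structures = {("revolute", "revolute"), ("prismatic",)}

regimes = {
    "same": classify_regime(("revolute", "revolute"), train_structures),
    "combined": classify_regime(("revolute", "revolute", "prismatic"),
                                train_structures),
    "novel": classify_regime(("spherical",), train_structures),
}
print(regimes)
```

Under these assumptions, a test robot is assigned to a regime from its link structure alone; continuous parameters such as link lengths would vary only within the interpolation regime.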