🤖 AI Summary
To address the low sample efficiency and poor generalization of imitation learning in long-horizon, multi-object robotic manipulation tasks, this paper proposes an affordance-centric policy-learning approach. The approach centers and orients a task frame directly on object affordance regions, achieving both intra-category invariance and spatial invariance, which improves policy generalization across object instances and object placements. The method uses existing generalist large vision models to extract and track the affordance frames, and learns policies by behavior cloning in the resulting task frame. With only 10 demonstrations, it matches the generalization of an image-based policy trained on 305 demonstrations. Video demonstrations are available on the project site.
📝 Abstract
Affordances are central to robotic manipulation, where most tasks can be simplified to interactions with task-specific regions on objects. By focusing on these key regions, we can abstract away task-irrelevant information, simplifying the learning process and enhancing generalisation. In this paper, we propose an affordance-centric policy-learning approach that centres and appropriately orients a task frame on these affordance regions, allowing us to achieve both intra-category invariance -- where policies can generalise across different instances within the same object category -- and spatial invariance -- which enables consistent performance regardless of object placement in the environment. We propose a method to leverage existing generalist large vision models to extract and track these affordance frames, and demonstrate that our approach can learn manipulation tasks using behaviour cloning from as few as 10 demonstrations, with equivalent generalisation to an image-based policy trained on 305 demonstrations. We provide video demonstrations on our project site: https://affordance-policy.github.io.
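The spatial-invariance claim above can be illustrated with a small sketch. It is not the authors' implementation: the helper names `pose_matrix` and `to_task_frame`, the restriction to rotations about the z-axis, and all numeric poses are assumptions made for illustration. The sketch expresses a gripper pose relative to a frame attached to the affordance region, and checks that this relative pose stays the same when the whole scene is moved and rotated in the world.

```python
import numpy as np

def pose_matrix(rotation_z_rad, translation):
    """Build a 4x4 homogeneous transform: rotation about z plus a translation.
    (Hypothetical helper; a real system would use full 3D orientations.)"""
    c, s = np.cos(rotation_z_rad), np.sin(rotation_z_rad)
    T = np.eye(4)
    T[:3, :3] = np.array([[c, -s, 0.0],
                          [s,  c, 0.0],
                          [0.0, 0.0, 1.0]])
    T[:3, 3] = translation
    return T

def to_task_frame(T_world_affordance, T_world_gripper):
    """Express the gripper pose relative to the affordance (task) frame."""
    return np.linalg.inv(T_world_affordance) @ T_world_gripper

# Scene 1: affordance frame in the world, gripper at a fixed offset from it.
T_aff_1 = pose_matrix(0.0, [0.5, 0.0, 0.1])
T_offset = pose_matrix(0.3, [0.0, 0.0, 0.05])  # gripper pose in the affordance frame
T_grip_1 = T_aff_1 @ T_offset

# Scene 2: the object (and the gripper with it) is moved and rotated in the world.
T_move = pose_matrix(1.2, [-0.3, 0.4, 0.0])
T_aff_2 = T_move @ T_aff_1
T_grip_2 = T_move @ T_grip_1

# In the task frame, both scenes yield the same observation.
obs_1 = to_task_frame(T_aff_1, T_grip_1)
obs_2 = to_task_frame(T_aff_2, T_grip_2)
same_observation = np.allclose(obs_1, obs_2)
```

Under these assumptions, a policy that consumes `obs_1`-style inputs sees identical data wherever the object is placed, which is the intuition behind learning in the affordance-centred frame rather than in the world or camera frame.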