🤖 AI Summary
Robots struggle to accurately recognize object affordances—functional properties enabling physical interactions—in real-world environments. Method: This paper proposes an affordance modeling framework integrating active exploration, object-level semantic SLAM, and interactive learning. It embeds object-level semantic maps into the exploration loop to enable cross-view object instance recognition and persistent tracking; further, it combines multi-view consistency modeling with interaction-driven data collection to train a robot-specific affordance prediction model. Contribution/Results: Experiments demonstrate a 37% improvement in exploration efficiency and a 12.6% absolute gain in affordance prediction accuracy over state-of-the-art methods. The framework is validated on a real robotic platform, confirming its generalizability across unseen objects and practical utility in downstream manipulation tasks.
📝 Abstract
Many robotic tasks in real-world environments require physical interactions with an object such as pick up or push. For successful interactions, the robot needs to know the object's affordances, which are defined as the potential actions the robot can perform with the object. In order to learn a robot-specific affordance predictor, we propose an interactive exploration pipeline which allows the robot to collect interaction experiences while exploring an unknown environment. We integrate an object-level map in the exploration pipeline such that the robot can identify different object instances and track objects across diverse viewpoints. This results in denser and more accurate affordance annotations compared to state-of-the-art methods, which do not incorporate a map. We show that our affordance exploration approach makes exploration more efficient and results in more accurate affordance prediction models compared to baseline methods.