🤖 AI Summary
This work addresses the challenge of affordance recognition for embodied robot task planning in urban environments, where sparse 3D scenes hinder commonsense understanding of object functionality. We propose the Correlation-Expected Conceptual Inference (CECI) model—the first to integrate expectation probability modeling with graph convolutional networks (GCNs) for fine-grained, intra-class semantic affordance estimation grounded in 3D scene graphs. Our method jointly leverages probabilistic distribution learning and structured scene reasoning, eliminating the need for dense supervision. Evaluated on real indoor environments, CECI achieves strong alignment with human commonsense judgments (Cohen’s κ = 0.87) and outperforms existing baselines by +12.6% in mean Average Precision (mAP). The core contribution is the first task-driven, generalizable probabilistic affordance reasoning framework, significantly enhancing robots’ understanding of intrinsic object functionality and enabling more effective task optimization.
📝 Abstract
This article studies the commonsense object affordance concept for enabling close-to-human task planning and task optimization of embodied robotic agents in urban environments. The focus of the object affordance is on reasoning how to effectively identify object’s inherent utility during the task execution, which in this work is enabled through the analysis of contextual relations of sparse information of 3D scene graphs. The proposed framework develops a Correlation Information (CECI) model to learn probability distributions using a Graph Convolutional Network, allowing to extract the commonsense affordance for individual members of a semantic class. The overall framework was experimentally validated in a real-world indoor environment, showcasing the ability of the method to level with human commonsense. For a video of the article, showcasing the experimental demonstration, please refer to the following link: https://youtu.be/BDCMVx2GiQE