Learning Environment-Aware Affordance for 3D Articulated Object Manipulation under Occlusions

📅 2023-09-14

🏛️ Neural Information Processing Systems

📈 Citations: 30

✨ Influential: 0

🤖 AI Summary

Existing point-level affordance methods are limited to single-object settings with homogeneous manipulators and do not model real-world challenges such as occlusion, geometric constraints, and robot embodiment, which leads to poor generalization in complex scenes. This work introduces the first environment-aware affordance framework for domestic assistive robots. The approach jointly models the environment and objects by integrating 3D articulation structure understanding, occlusion-aware geometry, and robot kinematic constraints. It proposes a contrastive affordance learning mechanism that trains on scenes with a single occluder and generalizes to multi-occluder configurations at deployment. Built on point-cloud representations and contrastive learning, the method significantly improves accuracy in locating movable parts and predicting manipulation intent under occlusion, on both synthetic and real-world datasets. Experiments demonstrate strong generalization across varying occlusion complexity, supporting the framework’s robustness for practical robotic interaction.

📝 Abstract

Perceiving and manipulating 3D articulated objects in diverse environments is essential for home-assistant robots. Recent studies have shown that point-level affordance provides actionable priors for downstream manipulation tasks. However, existing works primarily focus on single-object scenarios with homogeneous agents, overlooking the realistic constraints imposed by the environment and the agent's morphology, e.g., occlusions and physical limitations. In this paper, we propose an environment-aware affordance framework that incorporates both object-level actionable priors and environment constraints. Unlike object-centric affordance approaches, learning environment-aware affordance faces the challenge of combinatorial explosion due to the complexity of various occlusions, characterized by their quantities, geometries, positions and poses. To address this and enhance data efficiency, we introduce a novel contrastive affordance learning framework capable of training on scenes containing a single occluder and generalizing to scenes with complex occluder combinations. Experiments demonstrate the effectiveness of our proposed approach in learning affordance considering environment constraints. Project page at https://chengkaiacademycity.github.io/EnvAwareAfford/
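As a rough illustration of what "point-level affordance" means in practice, the toy sketch below treats an affordance map as one actionability score per point and picks the highest-scoring point as the contact candidate. The function name `pick_contact_point`, the coordinates, and both score lists are made up for illustration; in the paper the scores come from a learned model that takes the scene into account, not from hand-written values.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def pick_contact_point(
    points: Sequence[Point], scores: Sequence[float]
) -> Tuple[int, Point]:
    """Return (index, point) of the point with the highest affordance score."""
    if not points or len(points) != len(scores):
        raise ValueError("points and scores must be non-empty and of equal length")
    best = max(range(len(points)), key=lambda i: scores[i])
    return best, points[best]


# Hypothetical three-point cloud sampled on an articulated part.
points = [(0.0, 0.0, 0.5), (0.1, 0.0, 0.5), (0.2, 0.0, 0.5)]

# Made-up scores: an object-centric map ignores the surroundings, while an
# environment-aware map scores the middle point low because (in this toy
# scenario) an occluder blocks access to it.
object_only_scores = [0.2, 0.9, 0.6]
env_aware_scores = [0.2, 0.1, 0.6]

object_only_choice = pick_contact_point(points, object_only_scores)
env_aware_choice = pick_contact_point(points, env_aware_scores)
print(object_only_choice)
print(env_aware_choice)
```

With these invented numbers the object-centric map selects the blocked middle point, while the environment-aware map moves the choice to a point that remains reachable, which is the behavior the abstract motivates.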

Problem

Research questions and friction points this paper is trying to address.

Learning affordance for 3D articulated object manipulation

Accounting for constraints from the environment and the agent's morphology, such as occlusions and physical limitations

Generalizing from single-occluder training scenes to complex occluder combinations

Innovation

Methods, ideas, or system contributions that make the work stand out.

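Environment-aware affordance framework that combines object-level actionable priors with environment constraints

Contrastive affordance learning that trains on single-occluder scenes and generalizes to complex occlusions

Data-efficient generalization to diverse occluder combinations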
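This page does not spell out the contrastive objective, so the following is only a generic InfoNCE-style sketch of contrastive learning, not the paper's loss. It assumes a hypothetical setup in which an anchor embedding should end up close to one positive embedding and far from several negatives; which scene pairs count as positive or negative, the embedding vectors, and the temperature value are all assumptions for illustration.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def info_nce(
    anchor: Sequence[float],
    positive: Sequence[float],
    negatives: Sequence[Sequence[float]],
    temperature: float = 0.1,  # assumed value, not taken from the paper
) -> float:
    """Generic InfoNCE loss: -log softmax of the positive among all candidates."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[0]


# Hypothetical 2-D embeddings.
anchor = [1.0, 0.0]
similar = [0.9, 0.1]
dissimilar = [0.0, 1.0]
opposite = [-1.0, 0.0]

# Low loss when the designated positive really is the closest embedding.
loss_aligned = info_nce(anchor, similar, [dissimilar, opposite])
# Higher loss when a dissimilar embedding is labeled as the positive.
loss_misaligned = info_nce(anchor, dissimilar, [similar, opposite])
print(loss_aligned, loss_misaligned)
```

Minimizing a loss of this shape pulls the anchor toward its positive and pushes it away from the negatives; how the paper builds such pairs from single-occluder scenes is described in the paper itself, not here.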

🔎 Similar Papers

UniAff: A Unified Representation of Affordances for Tool Usage and Articulation with Vision-Language Models