🤖 AI Summary
This work addresses text-guided affordance grounding in RGB-D images, where missing observations hinder accurate prediction. The authors propose Affostruction, a generative framework that first reconstructs complete object geometry from partial observations and then grounds affordances over the full shape, including unobserved regions. Key innovations include generative multi-view reconstruction via sparse voxel fusion, flow-based modeling of affordance distributions to capture their inherent ambiguity, and an affordance-driven active view selection strategy. Experiments show significant improvements over existing methods: 19.1 aIoU for affordance grounding (a 40.4% relative gain) and 32.67 IoU for 3D reconstruction (a 67.7% relative improvement).
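Because the affordance regions matching a query are often multi-modal (e.g., several equally valid graspable spots), a flow-based model samples affordance fields rather than regressing a single mask. Below is a minimal, hypothetical sketch of how such sampling could look with a flow-matching-style ODE; `velocity_net`, the scalar per-point parameterization, and the token conditioning are illustrative assumptions, not the paper's actual architecture.

```python
import torch

def sample_affordance(velocity_net, shape_tokens, n_points, steps=32):
    """Draw one affordance sample by integrating a learned flow from noise.

    velocity_net(x, t, shape_tokens) -> dx/dt is a hypothetical network;
    different noise draws yield different plausible affordance fields,
    which is how ambiguity in the distribution can be captured.
    """
    x = torch.randn(n_points, 1)        # x_0 ~ N(0, I), one scalar per point
    dt = 1.0 / steps
    for i in range(steps):
        t = torch.full((n_points, 1), i * dt)
        x = x + dt * velocity_net(x, t, shape_tokens)  # forward Euler step
    return torch.sigmoid(x)             # squash to per-point probabilities

# Toy usage with a stand-in velocity field (illustration only).
toy_velocity = lambda x, t, tokens: -x + tokens.mean()
probs = sample_affordance(toy_velocity, torch.randn(64, 16), n_points=1024)
print(probs.shape)  # torch.Size([1024, 1])
```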
📝 Abstract
This paper addresses the problem of affordance grounding from RGB-D images of an object, which aims to localize the surface regions corresponding to a text query describing an action on the object. While existing methods predict affordance regions only on visible surfaces, we propose Affostruction, a generative framework that reconstructs complete geometry from partial observations and grounds affordances on the full shape, including unobserved regions. We make three core contributions: generative multi-view reconstruction via sparse voxel fusion that extrapolates unseen geometry while maintaining constant token complexity, flow-based affordance grounding that captures the inherent ambiguity of affordance distributions, and affordance-driven active view selection that leverages predicted affordances for intelligent viewpoint sampling. Affostruction achieves 19.1 aIoU on affordance grounding (a 40.4% improvement) and 32.67 IoU for 3D reconstruction (a 67.7% improvement), enabling accurate affordance prediction on complete shapes.
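To make affordance-driven active view selection concrete, here is a small self-contained sketch of one plausible scoring rule: rank candidate viewpoints by how much uncertain affordance mass they would reveal. The per-voxel visibility masks, the entropy weighting, and the scoring function are illustrative assumptions, not the paper's exact criterion.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the model's outputs (hypothetical, for illustration):
# per-voxel affordance probabilities on the reconstructed shape, and a
# 0/1 mask per candidate view of which voxels that camera would observe.
N_VOXELS, N_VIEWS = 1024, 16
affordance = rng.uniform(0.0, 1.0, N_VOXELS)
visibility = rng.integers(0, 2, (N_VIEWS, N_VOXELS)).astype(float)

def view_score(vis_mask, aff, eps=1e-8):
    """Score a view by the uncertain affordance mass it would observe."""
    entropy = -(aff * np.log(aff + eps) + (1 - aff) * np.log(1 - aff + eps))
    return float((vis_mask * aff * entropy).sum())

scores = [view_score(visibility[v], affordance) for v in range(N_VIEWS)]
best = int(np.argmax(scores))
print(f"next view to capture: {best} (score {scores[best]:.2f})")
```

Under this rule, a view that exposes voxels with high predicted affordance but high uncertainty wins, steering the next capture toward the regions that matter most for grounding.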