🤖 AI Summary
Existing NeRF methods focus on object-level representations and lack explicit modeling of inter-object semantic relationships. Method: This work introduces the first approach to directly extract open-vocabulary 3D semantic relations from NeRFs, implicitly encoding relations as ray pairs and designing a differentiable relation query network. It combines knowledge distillation from multimodal large language models (MLLMs) with open-vocabulary vision-language features to establish the first relation-prior-guided framework for radiance fields. Contribution/Results: The method achieves state-of-the-art performance on both open-vocabulary 3D scene graph generation and relation-guided instance segmentation. It significantly improves cross-object semantic relationship understanding, marking a critical step toward structured, reasoning-capable 3D scene understanding with NeRF.
📝 Abstract
Neural radiance fields are an emerging 3D scene representation and have recently been extended to learn features for scene understanding by distilling open-vocabulary features from vision-language models. However, current methods primarily focus on object-centric representations, supporting object segmentation or detection, while understanding semantic relationships between objects remains largely unexplored. To address this gap, we propose RelationField, the first method to extract inter-object relationships directly from neural radiance fields. RelationField represents relationships between objects as pairs of rays within a neural radiance field, effectively extending its formulation to include implicit relationship queries. To teach RelationField complex, open-vocabulary relationships, relationship knowledge is distilled from multi-modal LLMs. To evaluate RelationField, we solve open-vocabulary 3D scene graph generation tasks and relationship-guided instance segmentation, achieving state-of-the-art performance in both tasks. See the project website at https://relationfield.github.io.
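The core idea of querying a relationship via a pair of rays can be illustrated with a minimal sketch. This is a hypothetical simplification, not the paper's implementation: it assumes each ray has already been mapped to a relationship feature vector by the field, pools the pair by concatenation, and scores the pair against a text-query embedding with cosine similarity. The function name `relation_score` and the concatenation pooling are illustrative assumptions.

```python
import math

def relation_score(ray_feat_a, ray_feat_b, query_emb):
    """Score a ray pair against an open-vocabulary relationship query.

    Hypothetical sketch: concatenate the two rays' relationship features
    and compare the pair feature to the query's text embedding via
    cosine similarity. Feature extraction and the learned relation
    network are omitted.
    """
    pair_feat = list(ray_feat_a) + list(ray_feat_b)  # order encodes subject -> object
    dot = sum(p * q for p, q in zip(pair_feat, query_emb))
    norm_p = math.sqrt(sum(p * p for p in pair_feat))
    norm_q = math.sqrt(sum(q * q for q in query_emb))
    return dot / (norm_p * norm_q)

# Example: a pair whose concatenated feature aligns with the query scores 1.0.
score = relation_score([1.0, 0.0], [0.0, 1.0], [1.0, 0.0, 0.0, 1.0])
```

In the actual method the pair feature would come from a learned, differentiable relation query network supervised by MLLM-distilled relationship labels; the cosine comparison here merely mirrors how open-vocabulary queries are typically matched to distilled vision-language features.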