Relational Semantic Reasoning on 3D Scene Graphs for Open World Interactive Object Search

📅 2026-03-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses interactive object search in open-world home environments, where modeling semantic relationships and contextual cues among objects is crucial. Existing approaches are limited either by the unreliability of vision-language embeddings or by the high computational cost of large language models (LLMs). The authors propose SCOUT, an approach that integrates relational semantic reasoning with 3D scene graphs. SCOUT uses relational heuristics, such as room-object containment and object-object co-occurrence, to assign utility scores to rooms, frontiers, and objects, and introduces an offline distillation framework that transfers structured knowledge from LLMs into a lightweight model. The accompanying symbolic benchmark, SymSearch, enables scalable evaluation. Experiments in symbolic and simulation environments show that SCOUT outperforms embedding-based methods and reaches performance comparable to LLMs while keeping inference efficient; real-world robot experiments demonstrate transfer to physical environments.

📝 Abstract
Open-world interactive object search in household environments requires understanding semantic relationships between objects and their surrounding context to guide exploration efficiently. Prior methods rely either on vision-language embedding similarity, which does not reliably capture task-relevant relational semantics, or on large language models (LLMs), which are too slow and costly for real-time deployment. We introduce SCOUT: Scene Graph-Based Exploration with Learned Utility for Open-World Interactive Object Search, a novel method that searches directly over 3D scene graphs by assigning utility scores to rooms, frontiers, and objects using relational exploration heuristics such as room-object containment and object-object co-occurrence. To make this practical without sacrificing open-vocabulary generalization, we propose an offline procedural distillation framework that extracts structured relational knowledge from LLMs into lightweight models for on-robot inference. Furthermore, we present SymSearch, a scalable symbolic benchmark for evaluating semantic reasoning in interactive object search tasks. Extensive evaluations across symbolic and simulation environments show that SCOUT outperforms embedding similarity-based methods and matches LLM-level performance while remaining computationally efficient. Finally, real-world experiments demonstrate effective transfer to physical environments, enabling open-world interactive object search under realistic sensing and navigation constraints.
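
To make the idea of utility scoring over a scene graph more concrete, here is a minimal Python sketch. It is not the authors' implementation: the `Room` class, the `ROOM_CONTAINS` and `CO_OCCURS` tables, the weights, and the weighted-sum combination are all assumptions made for illustration. In the setting the abstract describes, the relational scores would presumably come from the distilled lightweight model rather than from hand-written tables.

```python
from dataclasses import dataclass, field


@dataclass
class Room:
    """Hypothetical scene-graph room node holding the objects seen so far."""
    name: str
    objects: list[str] = field(default_factory=list)


# Assumed relational priors in [0, 1], hard-coded only for illustration.
# (room, target) -> how plausibly the room contains the target
ROOM_CONTAINS = {("kitchen", "mug"): 0.8, ("bedroom", "mug"): 0.2}
# (observed object, target) -> how plausibly the two co-occur
CO_OCCURS = {("coffee machine", "mug"): 0.9, ("bed", "mug"): 0.1}


def room_utility(room: Room, target: str,
                 w_room: float = 0.5, w_obj: float = 0.5) -> float:
    """Combine room-object containment with the strongest object-object
    co-occurrence cue. The weighted sum is an assumed combination rule."""
    containment = ROOM_CONTAINS.get((room.name, target), 0.0)
    co_occurrence = max(
        (CO_OCCURS.get((obj, target), 0.0) for obj in room.objects),
        default=0.0,  # a room with no observed objects has no co-occurrence cue
    )
    return w_room * containment + w_obj * co_occurrence


def rank_rooms(rooms: list[Room], target: str) -> list[Room]:
    """Order rooms from most to least promising for finding the target."""
    return sorted(rooms, key=lambda r: room_utility(r, target), reverse=True)


if __name__ == "__main__":
    rooms = [Room("bedroom", ["bed"]), Room("kitchen", ["coffee machine"])]
    for room in rank_rooms(rooms, "mug"):
        print(room.name, round(room_utility(room, "mug"), 2))
```

With these made-up priors, the kitchen ranks above the bedroom when searching for a mug. The same pattern could be extended to frontiers and individual objects (for example, containers to open), which the abstract also lists as scored elements.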
Problem

Research questions and friction points this paper is trying to address.

interactive object search
3D scene graphs
relational semantic reasoning
open-world
semantic relationships
Innovation

Methods, ideas, or system contributions that make the work stand out.

3D scene graphs
relational semantic reasoning
open-world object search
procedural distillation
interactive exploration