🤖 AI Summary
This work addresses the challenge of interactive object search in open-world home environments, where effective modeling of semantic relationships and contextual cues among objects is crucial, yet is hindered by the unreliability of vision-language embeddings and the high computational cost of large language models (LLMs). To overcome these limitations, the authors propose SCOUT, a novel approach that integrates relational semantic reasoning with 3D scene graphs. SCOUT leverages heuristic rules, such as room-object containment and object co-occurrence, to assign utility scores to scene elements, and introduces an offline distillation framework that transfers structured knowledge from LLMs into a lightweight model. An accompanying symbolic benchmark, SymSearch, enables scalable evaluation. Experiments demonstrate that SCOUT significantly outperforms embedding-based methods in both simulated and real robotic environments, achieving performance comparable to LLMs while maintaining efficient inference.
📝 Abstract
Open-world interactive object search in household environments requires understanding semantic relationships between objects and their surrounding context to guide exploration efficiently. Prior methods either rely on vision-language embedding similarity, which does not reliably capture task-relevant relational semantics, or on large language models (LLMs), which are too slow and costly for real-time deployment. We introduce SCOUT: Scene Graph-Based Exploration with Learned Utility for Open-World Interactive Object Search, a novel method that searches directly over 3D scene graphs by assigning utility scores to rooms, frontiers, and objects using relational exploration heuristics such as room-object containment and object-object co-occurrence. To make this practical without sacrificing open-vocabulary generalization, we propose an offline procedural distillation framework that extracts structured relational knowledge from LLMs into lightweight models for on-robot inference. Furthermore, we present SymSearch, a scalable symbolic benchmark for evaluating semantic reasoning in interactive object search tasks. Extensive evaluations across symbolic and simulation environments show that SCOUT outperforms embedding similarity-based methods and matches LLM-level performance while remaining computationally efficient. Finally, real-world experiments demonstrate effective transfer to physical environments, enabling open-world interactive object search under realistic sensing and navigation constraints.
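To make the utility-scoring idea concrete, here is a minimal illustrative sketch of ranking rooms by combining a room-object containment prior with object-object co-occurrence cues. The priors, function names, and additive scoring rule below are hypothetical stand-ins for the distilled relational knowledge the abstract describes, not the authors' actual model.

```python
# Toy priors standing in for distilled LLM knowledge (illustrative values only):
# P(target is contained in room) and a co-occurrence affinity between objects.
ROOM_CONTAINMENT = {
    ("mug", "kitchen"): 0.8,
    ("mug", "bathroom"): 0.1,
}
CO_OCCURRENCE = {
    ("mug", "coffee_maker"): 0.9,
    ("mug", "toothbrush"): 0.05,
}

def utility(target: str, room: str, visible_objects: list[str]) -> float:
    """Score a room for the target by mixing a containment prior with
    co-occurrence cues from objects already observed in that room."""
    score = ROOM_CONTAINMENT.get((target, room), 0.05)  # small default prior
    for obj in visible_objects:
        score += CO_OCCURRENCE.get((target, obj), 0.0)
    return score

# Rank candidate rooms in a tiny scene graph when searching for a mug.
rooms = {
    "kitchen": ["coffee_maker", "sink"],
    "bathroom": ["toothbrush"],
}
ranked = sorted(rooms, key=lambda r: utility("mug", r, rooms[r]), reverse=True)
print(ranked)  # the kitchen outranks the bathroom for a mug
```

In the paper's framing, analogous scores would also be assigned to frontiers and individual objects, and the priors would come from the offline LLM distillation step rather than hand-written tables.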