🤖 AI Summary
Existing 3D scene graphs (3DSGs) rely on a static-world assumption and treat manipulable obstacles as impassable, which leads to limited reachability, low efficiency, and weak extensibility for embodied navigation in real-world environments. To address this, we propose HERO, a framework for constructing Hierarchical Traversable 3DSGs that explicitly encodes obstacle functionality and manipulability into the graph structure, thereby redefining “traversability” to unify semantic, spatial, and interactive relationships. Our method integrates functional semantic parsing, mobility classification, and relational reasoning within a hierarchical graph neural network architecture augmented with physics-aware action modeling. Relative to its baseline, HERO reduces path length (PL) by 35.1% in partially obstructed environments and increases success rate (SR) by 79.4% in fully obstructed ones. The approach thereby improves long-horizon reasoning, interactive planning, and adaptability to scenes whose obstacles can be moved or operated.
📝 Abstract
3D Scene Graphs (3DSGs) are a powerful representation of the physical world. They explicitly model the complex spatial, semantic, and functional relationships between entities, providing a foundational understanding that lets agents interact intelligently with their environment and execute versatile behaviors. Embodied navigation, a crucial component of such capabilities, leverages the compact and expressive nature of 3DSGs to enable long-horizon reasoning and planning in complex, large-scale environments. However, prior works rely on a static-world assumption: they define traversable space solely from static spatial layouts and therefore treat interactable obstacles as non-traversable. This fundamental limitation severely undermines their effectiveness in real-world scenarios, leading to limited reachability, low efficiency, and inferior extensibility. To address these issues, we propose HERO, a novel framework for constructing Hierarchical Traversable 3DSGs that redefines traversability by modeling operable obstacles as pathways, capturing their physical interactivity, functional semantics, and the scene's relational hierarchy. The results show that, relative to its baseline, HERO reduces path length (PL) by 35.1% in partially obstructed environments and increases success rate (SR) by 79.4% in fully obstructed ones, demonstrating substantially higher efficiency and reachability.
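To make the idea of "operable obstacles as pathways" concrete, the following is a minimal sketch, not HERO's actual graph construction or planner. It assumes a toy room-level graph in which each connection carries a walking cost and, if an operable obstacle such as a closed door blocks it, an extra interaction cost. The room names, the costs, and the `shortest_path` helper are all hypothetical and chosen only for illustration. Under the static-world assumption the blocked connections are dropped; under the redefined traversability they stay in the graph at a higher cost.

```python
import heapq

# Hypothetical toy scene: (room_a, room_b, walk_cost, interact_cost).
# interact_cost is None for a free connection, or a number when an
# operable obstacle (e.g. a closed door) must be manipulated first.
EDGES = [
    ("hall", "kitchen", 2.0, 1.0),      # direct route behind a closed door
    ("hall", "corridor", 4.0, None),    # free detour
    ("corridor", "kitchen", 5.0, None),
    ("kitchen", "pantry", 1.0, 1.5),    # pantry is only reachable via a door
]

def shortest_path(edges, start, goal, allow_interaction):
    """Dijkstra search; returns (cost, path) or None if the goal is unreachable."""
    graph = {}
    for u, v, walk_cost, interact_cost in edges:
        if interact_cost is not None and not allow_interaction:
            continue  # static-world assumption: obstacle means non-traversable
        cost = walk_cost + (interact_cost or 0.0)
        graph.setdefault(u, []).append((v, cost))
        graph.setdefault(v, []).append((u, cost))

    best = {start: 0.0}
    queue = [(0.0, start, [start])]
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return dist, path
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, cost in graph.get(node, []):
            new_dist = dist + cost
            if new_dist < best.get(nxt, float("inf")):
                best[nxt] = new_dist
                heapq.heappush(queue, (new_dist, nxt, path + [nxt]))
    return None

if __name__ == "__main__":
    for goal in ("kitchen", "pantry"):
        print(goal, "static:", shortest_path(EDGES, "hall", goal, False))
        print(goal, "interactive:", shortest_path(EDGES, "hall", goal, True))
```

In this toy setup the kitchen case mirrors a partially obstructed environment, where allowing interaction shortens the route, and the pantry case mirrors a fully obstructed one, where the goal is reachable only if the obstacle is treated as a pathway.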