Learning-Based Hierarchical Scene Graph Matching for Robot Localization Leveraging Prior Maps

📅 2026-04-30
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the high cost and limited accuracy of matching a robot's online scene graph against a hierarchical semantic map derived from a prior BIM model for indoor localization. The authors propose an end-to-end differentiable framework for hierarchical graph matching. The method models structural relationships at multiple granularities, from room level down to wall level, by adding intra-layer and inter-layer semantic edges to both graphs, and it is trained only on floor plans so that it transfers zero-shot to real environments. On real-world LiDAR data, the approach outperforms the combinatorial optimization baseline in F1 score while running an order of magnitude faster, which makes it more practical for large-scale indoor environments.
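To make the idea of intra-layer and inter-layer edges concrete, here is a minimal sketch of a two-level scene graph whose edges are tagged by whether they connect nodes on the same level or on different levels. The class names (`Level`, `SceneGraph`), the node identifiers, and the rule for labelling edges are illustrative assumptions, not the paper's actual data structures or edge definitions.

```python
from dataclasses import dataclass, field
from enum import Enum


class Level(Enum):
    """Hierarchy levels named in the summary (assumed two-level layout)."""
    ROOM = "room"
    WALL = "wall"


@dataclass
class SceneGraph:
    """Hypothetical container: node id -> level, plus typed edges."""
    levels: dict[str, Level] = field(default_factory=dict)
    edges: list[tuple[str, str, str]] = field(default_factory=list)

    def add_node(self, node_id: str, level: Level) -> None:
        self.levels[node_id] = level

    def add_edge(self, src: str, dst: str) -> str:
        # Assumption: an edge is "intra" when both endpoints share a level
        # and "inter" otherwise (e.g. a room connected to one of its walls).
        kind = "intra" if self.levels[src] == self.levels[dst] else "inter"
        self.edges.append((src, dst, kind))
        return kind


def build_example() -> SceneGraph:
    """Two rooms and two walls; both walls belong to room_a."""
    g = SceneGraph()
    g.add_node("room_a", Level.ROOM)
    g.add_node("room_b", Level.ROOM)
    g.add_node("wall_1", Level.WALL)
    g.add_node("wall_2", Level.WALL)
    g.add_edge("room_a", "room_b")  # room-to-room: intra-layer
    g.add_edge("wall_1", "wall_2")  # wall-to-wall: intra-layer
    g.add_edge("room_a", "wall_1")  # room-to-wall: inter-layer
    g.add_edge("room_a", "wall_2")  # room-to-wall: inter-layer
    return g
```

A learned matcher would consume two such graphs (one built online, one from the BIM prior) and could use the edge tag to pass messages differently within and across levels; how the paper does this is not specified in the summary.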
📝 Abstract
Accurate localization is a fundamental requirement for autonomous robots operating in indoor environments. Scene graphs encode the spatial structure of an environment as a hierarchy of semantic entities and their relationships, and can be constructed both online from robot sensor data and offline from architectural priors such as Building Information Models (BIM). Matching these two complementary representations enables drift correction in SLAM by grounding robot observations against a known structural prior. However, establishing reliable node-to-node correspondences between them remains an open challenge: existing combinatorial methods are prohibitively expensive at scale, and prior learned approaches address only flat graph matching, ignoring the multi-level semantic structure present in both representations. Here we present a learned, end-to-end differentiable pipeline that augments both graphs with semantically motivated edge types encoding intra- and inter-level relationships, explicitly exploiting this hierarchy to enable simultaneous matching from high-level room concepts down to low-level wall surfaces. Trained exclusively on floor plans, the proposed method outperforms the combinatorial baseline in F1 on real LiDAR environments while running an order of magnitude faster, demonstrating viable zero-shot generalization for BIM-assisted robot localization.
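The abstract reports matching quality as an F1 score over node-to-node correspondences. As a rough illustration, the sketch below computes the standard F1 (harmonic mean of precision and recall) over sets of predicted and ground-truth node pairs. The function name `matching_f1` and the pair representation are assumptions; the paper's exact evaluation protocol is not given here and may differ.

```python
def matching_f1(predicted, ground_truth):
    """F1 over correspondence pairs (online_node_id, prior_node_id).

    Assumes each correspondence is a hashable pair and that a predicted
    pair counts as correct only if it appears exactly in the ground truth.
    """
    pred = set(predicted)
    gt = set(ground_truth)
    true_positives = len(pred & gt)
    if true_positives == 0:
        # Covers empty inputs as well as predictions with no correct pair.
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gt)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: three predicted pairs, two of them correct,
# against four ground-truth pairs.
example_pred = [("r1", "R1"), ("w1", "W1"), ("w2", "W3")]
example_gt = [("r1", "R1"), ("w1", "W1"), ("w2", "W2"), ("w3", "W3")]
example_f1 = matching_f1(example_pred, example_gt)
```

In this example precision is 2/3 and recall is 2/4, so the F1 is 4/7. Because rooms and walls are matched simultaneously, the same function could be applied per level or to the pooled set of pairs.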
Problem

Research questions and friction points this paper is trying to address.

scene graph matching
robot localization
hierarchical structure
semantic correspondence
prior maps
Innovation

Methods, ideas, or system contributions that make the work stand out.

hierarchical scene graph
learned graph matching
BIM-assisted localization
zero-shot generalization
semantic SLAM