SG-Reg: Generalizable and Efficient Scene Graph Registration

📅 2025-04-20
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the rigid registration of semantic scene graphs in multi-agent and cross-temporal scenarios, where conventional handcrafted descriptors generalize poorly and existing learning-based methods rely heavily on ground-truth annotations. To overcome these limitations, we propose an annotation-free data generation paradigm that leverages vision foundation models (VFMs) and a semantic mapping module to reconstruct semantic scene graphs. We design a compact node representation integrating open-vocabulary semantics, spatial topology, and geometric shape priors, and a multimodal graph neural network that matches nodes in a coarse-to-fine manner; a robust back-end pose estimator then recovers the relative transformation, and the map is kept as a sparse, hierarchical scene representation. Evaluated on a two-agent SLAM benchmark, our method achieves significantly higher registration success rates than handcrafted feature-based approaches and slightly higher recall than visual loop-closure networks, while requiring only 52 KB of communication bandwidth per query frame, demonstrating efficiency, generalizability, and practical applicability.
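The node representation described above fuses open-vocabulary semantics, spatial topology, and shape priors into one compact feature per node. Below is a minimal PyTorch sketch of that kind of multimodal fusion; the feature dimensions and the simple concatenate-and-project design are illustrative assumptions, not the network released by the authors.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NodeFusion(nn.Module):
    """Toy fusion of three per-node modalities into one compact feature.

    Assumed input shapes (illustrative, not taken from the paper):
      sem:   (N, 512)  open-vocabulary semantic embeddings
      topo:  (N, 64)   local-topology / spatial-context features
      shape: (N, 128)  geometric shape descriptors
    """

    def __init__(self, sem_dim=512, topo_dim=64, shape_dim=128, out_dim=256):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(sem_dim + topo_dim + shape_dim, out_dim),
            nn.ReLU(),
            nn.Linear(out_dim, out_dim),
        )

    def forward(self, sem, topo, shape):
        fused = torch.cat([sem, topo, shape], dim=-1)   # concatenate modalities
        return F.normalize(self.proj(fused), dim=-1)    # unit-norm compact node feature

if __name__ == "__main__":
    model = NodeFusion()
    sem, topo, shape = torch.randn(10, 512), torch.randn(10, 64), torch.randn(10, 128)
    print(model(sem, topo, shape).shape)                # torch.Size([10, 256])
```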

📝 Abstract
This paper addresses the challenges of registering two rigid semantic scene graphs, an essential capability when an autonomous agent needs to register its map against that of a remote agent, or against a prior map. The hand-crafted descriptors in classical semantic-aided registration, and the reliance on ground-truth annotations in learning-based scene graph registration, impede their application in practical real-world environments. To address these challenges, we design a scene graph network to encode multiple modalities of semantic nodes: open-set semantic feature, local topology with spatial awareness, and shape feature. These modalities are fused to create compact semantic node features. The matching layers then search for correspondences in a coarse-to-fine manner. In the back-end, we employ a robust pose estimator to compute the transformation from the correspondences. We maintain a sparse and hierarchical scene representation. Our approach demands fewer GPU resources and less communication bandwidth in multi-agent tasks. Moreover, we design a new data generation approach that uses vision foundation models and a semantic mapping module to reconstruct semantic scene graphs. It differs significantly from previous works, which rely on ground-truth semantic annotations to generate data. We validate our method in a two-agent SLAM benchmark. It significantly outperforms the hand-crafted baseline in terms of registration success rate. Compared to visual loop closure networks, our method achieves a slightly higher registration recall while requiring only 52 KB of communication bandwidth for each query frame. Code available at: http://github.com/HKUST-Aerial-Robotics/SG-Reg.
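To make the matching and back-end data flow concrete, here is a small NumPy sketch that pairs node features by mutual nearest neighbour and recovers a rigid transform from the matched node centroids with a Kabsch/SVD solve. Both steps are generic stand-ins: the paper's coarse-to-fine matcher and robust pose estimator are more involved, and the shapes and helper names here are assumptions.

```python
import numpy as np

def mutual_nn_matches(feat_a, feat_b):
    """Mutual nearest-neighbour correspondences between two sets of
    unit-norm node features with shapes (Na, D) and (Nb, D)."""
    sim = feat_a @ feat_b.T                       # cosine similarity matrix
    best_b = sim.argmax(axis=1)                   # best match in B for each node in A
    best_a = sim.argmax(axis=0)                   # best match in A for each node in B
    return [(i, j) for i, j in enumerate(best_b) if best_a[j] == i]

def kabsch(src, dst):
    """Least-squares rigid transform (R, t) such that dst ≈ R @ src + t."""
    src_c, dst_c = src - src.mean(0), dst - dst.mean(0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
    d = np.sign(np.linalg.det(Vt.T @ U.T))        # keep det(R) = +1 (no reflection)
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst.mean(0) - R @ src.mean(0)
    return R, t

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(8, 256))
    feats /= np.linalg.norm(feats, axis=1, keepdims=True)
    centroids = rng.normal(size=(8, 3))
    perm = rng.permutation(8)                     # the same nodes, reordered in graph B
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    R_true = Q if np.linalg.det(Q) > 0 else -Q    # random proper rotation
    t_true = rng.normal(size=3)
    feat_b, centroids_b = feats[perm], centroids[perm] @ R_true.T + t_true

    matches = mutual_nn_matches(feats, feat_b)
    src = np.stack([centroids[i] for i, _ in matches])
    dst = np.stack([centroids_b[j] for _, j in matches])
    R, t = kabsch(src, dst)
    print(np.allclose(R, R_true, atol=1e-6), np.allclose(t, t_true, atol=1e-6))
```

In practice a robust variant (e.g. outlier-aware weighting or RANSAC over the correspondences) would replace the plain least-squares solve, which is what the paper's robust back-end is for.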
Problem

Research questions and friction points this paper is trying to address.

Registering rigid semantic scene graphs efficiently
Reducing reliance on ground-truth annotations
Minimizing GPU and communication bandwidth usage
Innovation

Methods, ideas, or system contributions that make the work stand out.

Scene graph network encodes multiple semantic modalities
Coarse-to-fine matching layers find correspondences efficiently
Vision foundation models generate training data without ground-truth annotations (see the sketch below)
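As a rough illustration of the annotation-free data generation idea in the last bullet, the sketch below fuses per-frame instance masks and labels (assumed to come from a vision foundation model) with depth and camera poses into per-instance point clouds and label votes, i.e. the raw material for scene-graph nodes. The frame dictionary layout, function names, and the absence of cross-frame instance association are all simplifying assumptions, not the authors' semantic mapping module.

```python
import numpy as np
from collections import defaultdict

def backproject(depth, K, pose):
    """Lift a depth image (H, W) to world-frame points (H, W, 3).
    K is the 3x3 intrinsic matrix, pose the 4x4 camera-to-world transform."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    rays = np.linalg.inv(K) @ np.stack([u, v, np.ones_like(u)], 0).reshape(3, -1)
    pts_cam = rays * depth.reshape(1, -1)
    pts_world = pose[:3, :3] @ pts_cam + pose[:3, 3:4]
    return pts_world.T.reshape(h, w, 3)

def accumulate_nodes(frames, nodes=None):
    """Fuse per-frame VFM detections into per-instance scene-graph nodes.

    Each frame is a dict with keys: depth (H, W), K (3, 3), pose (4, 4), and
    detections, a list of (instance_id, text_label, boolean_mask).  How the
    masks and labels are produced, and how instances are associated across
    frames, is abstracted away here."""
    nodes = nodes if nodes is not None else defaultdict(lambda: {"points": [], "labels": []})
    for f in frames:
        pts = backproject(f["depth"], f["K"], f["pose"])
        for inst_id, label, mask in f["detections"]:
            valid = mask & (f["depth"] > 0)        # masked pixels with valid depth
            nodes[inst_id]["points"].append(pts[valid])
            nodes[inst_id]["labels"].append(label)
    return nodes
```

A real pipeline would additionally associate instances across frames, filter noisy masks, and derive node centroids, shape descriptors, and open-vocabulary embeddings from the accumulated points and label votes.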
Chuhao Liu
Ph.D. Candidate at Hong Kong University of Science and Technology
Semantic Scene Understanding · Robotics · SLAM
Zhijian Qiao
Department of Electronic and Computer Engineering, the Hong Kong University of Science and Technology, Hong Kong, China
Jieqi Shi
School of Intelligence Science and Technology, Nanjing University, Jiangsu, China
Ke Wang
School of Information Engineering, Chang'an University
Peize Liu
Department of Electronic and Computer Engineering, the Hong Kong University of Science and Technology, Hong Kong, China
Shaojie Shen
Associate Professor, Hong Kong University of Science and Technology
Robotics