SLC$^2$-SLAM: Semantic-guided Loop Closure with Shared Latent Code for NeRF SLAM

📅 2025-01-15
🤖 AI Summary
To address the degradation of tracking and reconstruction accuracy that accumulated pose drift causes in NeRF-based SLAM, this paper proposes a semantic-guided loop closure method. The approach reuses the latent codes already stored in NeRF's implicit scene representation as local features for loop detection, and uses semantic information decoded from the same codes to guide how those features are aggregated, which makes loop matching more robust. Detected loops are closed with graph optimization followed by bundle adjustment, refining both the camera poses and the reconstructed scene. On the Replica and ScanNet datasets, the loop detection significantly outperforms the pre-trained NetVLAD and ORB with Bag-of-Words baselines, and the full system shows better tracking and reconstruction, particularly in larger scenes with more loops such as ScanNet.
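The summary does not spell out how the aggregation is computed, so the following is only a minimal sketch of one way semantic-guided weighted aggregation could look. It assumes each keyframe provides N local features (latent codes) and per-feature semantic logits decoded from them. The function names, the per-class importance vector `class_weights`, and the similarity `threshold` are all hypothetical and are not taken from the paper.

```python
import numpy as np


def aggregate_descriptor(local_feats, sem_logits, class_weights):
    """Pool N local features into one unit-length global descriptor.

    local_feats:   (N, D) latent codes used as local features
    sem_logits:    (N, C) semantic logits decoded from the same codes
    class_weights: (C,)   hypothetical per-class importance (an assumption)
    """
    # Numerically stable softmax over the class axis.
    shifted = sem_logits - sem_logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)

    # Expected class importance of each feature, shape (N,).
    weights = probs @ class_weights

    # Weighted mean of the local features, then L2 normalisation.
    desc = (weights[:, None] * local_feats).sum(axis=0) / (weights.sum() + 1e-12)
    return desc / (np.linalg.norm(desc) + 1e-12)


def loop_candidates(query_desc, keyframe_descs, threshold=0.9):
    """Return indices of stored keyframes whose cosine similarity to the
    query reaches the (hypothetical) threshold."""
    sims = keyframe_descs @ query_desc  # all descriptors are unit length
    return [int(i) for i in np.flatnonzero(sims >= threshold)]
```

In this sketch, features that are likely to belong to low-importance classes contribute little to the global descriptor, which is one plausible reading of "semantic-guided" aggregation. A real system would also typically exclude temporally adjacent keyframes from the candidate set.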

📝 Abstract
Targeting the notorious cumulative drift errors in NeRF SLAM, we propose a Semantic-guided Loop Closure with Shared Latent Code, dubbed SLC$^2$-SLAM. In particular, we argue that the latent codes stored in many NeRF SLAM systems are not fully exploited, as they are only used for better reconstruction. In this paper, we propose a simple yet effective way to detect potential loops using the same latent codes as local features. To further improve loop detection performance, we use semantic information, which is also decoded from the same latent codes, to guide the aggregation of local features. Finally, with the potential loops detected, we close them with a graph optimization followed by bundle adjustment to refine both the estimated poses and the reconstructed scene. To evaluate the performance of our SLC$^2$-SLAM, we conduct extensive experiments on the Replica and ScanNet datasets. Our proposed semantic-guided loop closure significantly outperforms the pre-trained NetVLAD and ORB combined with Bag-of-Words, which are used in all other NeRF SLAM systems with loop closure. As a result, our SLC$^2$-SLAM also demonstrates better tracking and reconstruction performance, especially in larger scenes with more loops, like ScanNet.
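To illustrate why closing a loop with graph optimization reduces cumulative drift, here is a toy sketch, not the paper's optimizer. It assumes scalar 1D poses, equally weighted edges, and a linear least-squares solve. Real SLAM back ends optimize SE(3) poses with nonlinear, often robust, solvers. The function name and the example measurements are hypothetical.

```python
import numpy as np


def optimize_pose_graph_1d(num_poses, edges):
    """Solve a 1D pose graph by linear least squares.

    edges: list of (i, j, z) meaning "pose j minus pose i was measured as z".
    Pose 0 is anchored at 0 to remove the gauge freedom.
    """
    rows, rhs = [], []

    # Anchor row: x_0 = 0.
    anchor = np.zeros(num_poses)
    anchor[0] = 1.0
    rows.append(anchor)
    rhs.append(0.0)

    # One row per relative measurement: x_j - x_i = z.
    for i, j, z in edges:
        row = np.zeros(num_poses)
        row[i], row[j] = -1.0, 1.0
        rows.append(row)
        rhs.append(z)

    A = np.vstack(rows)
    b = np.array(rhs)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x


# Odometry says the camera ends 0.1 away from where it started ...
odometry = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, -1.9)]
# ... but a detected loop says pose 3 revisits pose 0.
loop = [(3, 0, 0.0)]
poses = optimize_pose_graph_1d(4, odometry + loop)
```

Because all edges carry equal weight here, the 0.1 of accumulated drift is spread evenly over the four edges instead of piling up at the end of the trajectory. In the pipeline described above, bundle adjustment would then further refine the poses and the scene.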
Problem

Research questions and friction points this paper is trying to address.

NeRF SLAM
pose drift
large-scale scenes
Innovation

Methods, ideas, or system contributions that make the work stand out.

SLC²-SLAM
NeRF SLAM Optimization
Semantic Understanding
Authors

Yuhang Ming
Lecturer at Hangzhou Dianzi University
SLAM, VPR, Computer Vision, Robotics, Spatial AI

Di Ma
School of Computer Science, Hangzhou Dianzi University and Key Laboratory of Brain Machine Collaborative Intelligence of Zhejiang Province, Hangzhou, 310018, China

Weichen Dai
Hangzhou Dianzi University
3D Vision, SLAM, Brain-inspired intelligence

Guofeng Zhang
State Key Lab of CAD&CG, Zhejiang University, Hangzhou, 310058, China

Wanzeng Kong
School of Computer Science, Hangzhou Dianzi University and Key Laboratory of Brain Machine Collaborative Intelligence of Zhejiang Province, Hangzhou, 310018, China