NeuraLoc: Visual Localization in Neural Implicit Map with Dual Complementary Features

📅 2025-03-08
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing NeRF-based visual localization methods suffer from three key limitations: a lack of explicit geometric constraints, heavy storage requirements for the features used in matching, and unreliable matching caused by the semantic ambiguity of descriptors. This paper proposes an efficient localization framework built on a neural implicit map with complementary features. It jointly learns an implicit 3D keypoint descriptor field and semantic contextual feature fields. To reduce the domain gap between 2D and 3D feature spaces, it aligns descriptor similarity distributions during matching. It then builds a matching graph from both descriptors and contextual features to establish 2D–3D correspondences for 6-DoF pose estimation. Compared with recent NeRF-based approaches, the method trains 3× faster and needs 45× less model storage, and it outperforms or is highly competitive with other state-of-the-art NeRF-based localization methods on the ScanNet and 7Scenes benchmarks.
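The summary ends with 6-DoF pose estimation from 2D–3D correspondences. As a minimal sketch of what a candidate pose is scored against, the snippet below computes the mean reprojection error under a standard pinhole camera model. The function names, the intrinsics tuple layout, and the use of a plain rotation matrix are illustrative assumptions, not details taken from the paper.

```python
import math

def project(point_3d, rotation, translation, fx, fy, cx, cy):
    """Project a world point into the image with a pinhole model (assumed here)."""
    # Camera-frame coordinates: X_cam = R * X_world + t
    xc, yc, zc = (
        sum(rotation[r][c] * point_3d[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
    if zc <= 0:
        raise ValueError("point is behind the camera")
    return (fx * xc / zc + cx, fy * yc / zc + cy)

def mean_reprojection_error(points_3d, pixels_2d, rotation, translation, intrinsics):
    """Average pixel distance between observed keypoints and projected 3D points."""
    fx, fy, cx, cy = intrinsics
    errors = []
    for point, (u, v) in zip(points_3d, pixels_2d):
        pu, pv = project(point, rotation, translation, fx, fy, cx, cy)
        errors.append(math.hypot(pu - u, pv - v))
    return sum(errors) / len(errors)
```

A pose solver (for example PnP inside a RANSAC loop) would search for the rotation and translation that keep this error small over the inlier correspondences.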

📝 Abstract
Recently, neural radiance fields (NeRF) have gained significant attention in the field of visual localization. However, existing NeRF-based approaches either lack geometric constraints or require extensive storage for feature matching, limiting their practical applications. To address these challenges, we propose an efficient and novel visual localization approach based on the neural implicit map with complementary features. Specifically, to enforce geometric constraints and reduce storage requirements, we implicitly learn a 3D keypoint descriptor field, avoiding the need to explicitly store point-wise features. To further address the semantic ambiguity of descriptors, we introduce additional semantic contextual feature fields, which enhance the quality and reliability of 2D-3D correspondences. Besides, we propose descriptor similarity distribution alignment to minimize the domain gap between 2D and 3D feature spaces during matching. Finally, we construct the matching graph using both complementary descriptors and contextual features to establish accurate 2D-3D correspondences for 6-DoF pose estimation. Compared with the recent NeRF-based approaches, our method achieves a 3× faster training speed and a 45× reduction in model storage. Extensive experiments on two widely used datasets demonstrate that our approach outperforms or is highly competitive with other state-of-the-art NeRF-based visual localization methods. Project page: https://zju3dv.github.io/neuraloc
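The abstract names descriptor similarity distribution alignment but does not give its formula. One plausible reading, assumed here purely for illustration, is to turn a query descriptor's similarity scores into probability distributions with a softmax and penalize the KL divergence between the distribution over 2D reference descriptors and the one over the corresponding 3D field descriptors. All function names, the cosine similarity, and the temperature value are hypothetical choices.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def softmax(scores, temperature=0.1):
    """Numerically stable softmax over a list of similarity scores."""
    peak = max(scores)
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def alignment_loss(query_2d, refs_2d, refs_3d, temperature=0.1):
    """Assumed loss: match the 2D-3D similarity distribution to the 2D-2D one.

    refs_2d[i] and refs_3d[i] are taken to describe the same keypoint, once
    from an image descriptor and once from the learned 3D descriptor field.
    """
    target = softmax([cosine(query_2d, r) for r in refs_2d], temperature)
    predicted = softmax([cosine(query_2d, r) for r in refs_3d], temperature)
    return kl_divergence(target, predicted)
```

Under this assumption the loss is zero when the 3D descriptors rank the candidates exactly as the 2D descriptors do, and it grows as the two rankings diverge.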
Problem

Research questions and friction points this paper is trying to address.

Addresses lack of geometric constraints in NeRF-based visual localization.
Reduces storage requirements by learning 3D keypoint descriptor fields implicitly.
Enhances 2D-3D correspondences with semantic contextual feature fields.
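The storage argument can be made concrete with a back-of-the-envelope count: an explicit map stores one descriptor per 3D point, while an implicit field stores only network weights. The sketch below compares the two under purely hypothetical sizes (a plain MLP, float32 values). It is not the paper's architecture and does not reproduce the reported 45× figure.

```python
def explicit_bytes(num_points, dim, bytes_per_value=4):
    """Storage for one dim-dimensional descriptor per 3D point."""
    return num_points * dim * bytes_per_value

def mlp_bytes(layer_sizes, bytes_per_value=4):
    """Storage for a fully connected network: weights plus biases per layer."""
    params = sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )
    return params * bytes_per_value

# Hypothetical sizes, chosen only to illustrate the comparison.
explicit = explicit_bytes(num_points=1_000_000, dim=128)
implicit = mlp_bytes([3, 256, 256, 128])  # xyz in, 128-D descriptor out
print(f"explicit: {explicit / 1e6:.1f} MB, implicit: {implicit / 1e6:.2f} MB")
```

The explicit cost grows linearly with the number of stored points, whereas the implicit cost depends only on the network size, which is why an implicit field can be far smaller for large scenes.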
Innovation

Methods, ideas, or system contributions that make the work stand out.

Implicit 3D keypoint descriptor field learning
Semantic contextual feature fields integration
Descriptor similarity distribution alignment technique
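To illustrate why a contextual feature helps when keypoint descriptors alone are ambiguous, the sketch below fuses descriptor and context similarities into one score and keeps mutual nearest neighbors. This is a deliberately simple stand-in for the paper's matching graph, whose construction the summary does not describe. The weighting scheme and all names are assumptions.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def fused_scores(desc_2d, ctx_2d, desc_3d, ctx_3d, ctx_weight=0.5):
    """Score every 2D keypoint against every 3D keypoint.

    score = descriptor similarity + ctx_weight * context similarity
    """
    return [
        [dot(d2, d3) + ctx_weight * dot(c2, c3) for d3, c3 in zip(desc_3d, ctx_3d)]
        for d2, c2 in zip(desc_2d, ctx_2d)
    ]

def mutual_nearest_neighbors(scores):
    """Keep (i, j) only if j is the best column for row i and vice versa."""
    n_rows, n_cols = len(scores), len(scores[0])
    best_col = [max(range(n_cols), key=lambda j: scores[i][j]) for i in range(n_rows)]
    best_row = [max(range(n_rows), key=lambda i: scores[i][j]) for j in range(n_cols)]
    return [(i, j) for i, j in enumerate(best_col) if best_row[j] == i]

# Two 2D keypoints with identical descriptors (e.g. repeated structure)
# that differ only in their context features.
desc_2d = [[1.0, 0.0], [1.0, 0.0]]
ctx_2d = [[1.0, 0.0], [0.0, 1.0]]
desc_3d = [[1.0, 0.0], [1.0, 0.0]]
ctx_3d = [[0.0, 1.0], [1.0, 0.0]]
matches = mutual_nearest_neighbors(fused_scores(desc_2d, ctx_2d, desc_3d, ctx_3d))
```

With the descriptors alone, every pairing in this toy example scores the same. The context term breaks the tie, so each 2D keypoint is paired with the 3D point that shares its context.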
Authors

Hongjia Zhai: PhD student in Computer Science, State Key Lab of CAD&CG, Zhejiang University
Boming Zhao: Computer Science, Zhejiang University (3D Vision)
Hai Li: RayNeo
Xiaokun Pan: State Key Lab of CAD&CG, Zhejiang University
Yijia He: Tencent XR Vision Labs (SLAM/VIO, 3D Vision)
Zhaopeng Cui: Zhejiang University (Computer Vision, Robotics, Computer Graphics)
Hujun Bao: State Key Lab of CAD&CG, Zhejiang University
Guofeng Zhang: State Key Lab of CAD&CG, Zhejiang University