NeuraLoc: Visual Localization in Neural Implicit Map with Dual Complementary Features

📅 2025-03-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing NeRF-based visual localization methods either lack explicit geometric constraints or require extensive storage for feature matching, and their descriptors suffer from semantic ambiguity that makes matching unreliable. This paper proposes an efficient localization framework built on a neural implicit map with complementary features. It implicitly learns a 3D keypoint descriptor field together with semantic contextual feature fields, so point-wise features need not be stored explicitly. To reduce the domain gap between 2D and 3D feature spaces, it aligns descriptor similarity distributions during matching. A matching graph built from both descriptors and contextual features then establishes 2D-3D correspondences for 6-DoF pose estimation. Compared with recent NeRF-based approaches, the method trains 3× faster and needs 45× less model storage, and it outperforms or is highly competitive with other state-of-the-art NeRF-based methods on the ScanNet and 7Scenes benchmarks.

📝 Abstract

Recently, neural radiance fields (NeRF) have gained significant attention in the field of visual localization. However, existing NeRF-based approaches either lack geometric constraints or require extensive storage for feature matching, limiting their practical applications. To address these challenges, we propose an efficient and novel visual localization approach based on the neural implicit map with complementary features. Specifically, to enforce geometric constraints and reduce storage requirements, we implicitly learn a 3D keypoint descriptor field, avoiding the need to explicitly store point-wise features. To further address the semantic ambiguity of descriptors, we introduce additional semantic contextual feature fields, which enhance the quality and reliability of 2D-3D correspondences. Besides, we propose descriptor similarity distribution alignment to minimize the domain gap between 2D and 3D feature spaces during matching. Finally, we construct the matching graph using both complementary descriptors and contextual features to establish accurate 2D-3D correspondences for 6-DoF pose estimation. Compared with the recent NeRF-based approaches, our method achieves a 3× faster training speed and a 45× reduction in model storage. Extensive experiments on two widely used datasets demonstrate that our approach outperforms or is highly competitive with other state-of-the-art NeRF-based visual localization methods. Project page: https://zju3dv.github.io/neuraloc
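
As a rough illustration of what aligning descriptor similarity distributions could look like, the sketch below compares, for each 2D keypoint, its softmax-normalized similarities to the other 2D descriptors against its similarities to the 3D descriptors, and penalizes the difference with a KL divergence. The abstract does not give the actual loss, so the KL formulation, the cosine similarity, the `temperature` value, and every function name here are assumptions made for illustration only.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def softmax(scores, temperature=0.1):
    """Turn similarity scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def alignment_loss(desc_2d, desc_3d, temperature=0.1):
    """Mean KL between 2D-to-2D and 2D-to-3D similarity distributions.

    Assumes desc_2d[i] and desc_3d[i] describe the same scene point
    (hypothetical training setup, not taken from the paper).
    """
    n = len(desc_2d)
    total = 0.0
    for i in range(n):
        p = softmax([cosine(desc_2d[i], desc_2d[j]) for j in range(n)], temperature)
        q = softmax([cosine(desc_2d[i], desc_3d[j]) for j in range(n)], temperature)
        total += kl_divergence(p, q)
    return total / n
```

Under these assumptions the loss is zero when the 3D descriptors reproduce the 2D descriptors exactly and grows as the two similarity structures diverge.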

Problem

Research questions and friction points this paper is trying to address.

Addresses lack of geometric constraints in NeRF-based visual localization.

Reduces storage requirements by learning 3D keypoint descriptor fields implicitly.

Enhances 2D-3D correspondences with semantic contextual feature fields.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Implicit 3D keypoint descriptor field learning

Semantic contextual feature fields integration

Descriptor similarity distribution alignment technique
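
To make the role of the complementary features concrete, the sketch below scores each 2D keypoint against each 3D point with a weighted sum of descriptor similarity and contextual-feature similarity, then keeps mutual nearest neighbors. This is a deliberately simple stand-in, not the paper's matching graph: the weighted sum, the `weight` and `min_score` parameters, the `desc`/`ctx` dictionary keys, and the function names are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def combined_score(kp, pt, weight=0.5):
    """Blend descriptor similarity with contextual-feature similarity."""
    return (weight * cosine(kp["desc"], pt["desc"])
            + (1.0 - weight) * cosine(kp["ctx"], pt["ctx"]))

def mutual_matches(keypoints_2d, points_3d, weight=0.5, min_score=0.0):
    """Return (2D index, 3D index) pairs that are each other's best match."""
    scores = [[combined_score(k, p, weight) for p in points_3d]
              for k in keypoints_2d]
    matches = []
    for i, row in enumerate(scores):
        j = max(range(len(row)), key=row.__getitem__)  # best 3D point for keypoint i
        best_i = max(range(len(scores)), key=lambda r: scores[r][j])  # best keypoint for point j
        if best_i == i and row[j] >= min_score:
            matches.append((i, j))
    return matches

# Two keypoints with identical descriptors; only the context tells them apart.
keypoints = [{"desc": [1.0, 0.0], "ctx": [1.0, 0.0]},
             {"desc": [1.0, 0.0], "ctx": [0.0, 1.0]}]
points = [{"desc": [1.0, 0.0], "ctx": [0.0, 1.0]},
          {"desc": [1.0, 0.0], "ctx": [1.0, 0.0]}]
print(mutual_matches(keypoints, points))
```

In this toy case the descriptors alone are ambiguous, and the contextual term is what separates the two candidates. In a localization pipeline, correspondences like these would typically feed a PnP solver with RANSAC to recover the 6-DoF pose.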

🔎 Similar Papers

Brain Mapping with Dense Features: Grounding Cortical Semantic Selectivity in Natural Images With Vision Transformers