FUSELOC: Fusing Global and Local Descriptors to Disambiguate 2D-3D Matching in Visual Localization

📅 2024-08-21
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
In 2D-3D visual localization, direct matching needs little memory but suffers from high ambiguity because of its large search space, while hierarchical methods are more accurate but incur substantial memory overhead because all database images must be stored for feature matching. To address this trade-off, we propose a lightweight fusion matching framework that brings global descriptors (e.g., NetVLAD) into the direct matching pipeline. A weighted average operator fuses each local descriptor (e.g., SuperPoint) with a global descriptor, which rearranges the local descriptor space so that descriptors from geographically nearby places lie closer together and interference from geographically distant candidates is reduced. Evaluated on four datasets, our method consistently improves accuracy over local-only approaches and comes close to hierarchical methods while halving memory requirements. The authors state that the code will be published.

📝 Abstract
Hierarchical methods represent the state of the art in visual localization, optimizing search efficiency by using global descriptors to focus on relevant map regions. However, this performance comes at the cost of substantial memory requirements, as all database images must be stored for feature matching. In contrast, direct 2D-3D matching algorithms require significantly less memory but suffer from lower accuracy due to the larger and more ambiguous search space. We address this ambiguity by fusing local and global descriptors using a weighted average operator within a 2D-3D search framework. This fusion rearranges the local descriptor space such that geographically nearby local descriptors are closer in the feature space according to the global descriptors. As a result, the number of irrelevant competing descriptors decreases, especially for those that are geographically distant, which increases the likelihood of correctly matching a query descriptor. We consistently improve the accuracy over local-only systems and achieve performance close to hierarchical methods while halving memory requirements. Extensive experiments using various state-of-the-art local and global descriptors across four different datasets demonstrate the effectiveness of our approach. For the first time, our approach enables direct matching algorithms to benefit from global descriptors while maintaining memory efficiency. The code for this paper will be published at https://github.com/sontung/descriptor-disambiguation.
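The weighted-average fusion described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`l2_normalize`, `fuse`, `match`), the `weight` parameter, the L2 normalization steps, and the assumption that the global descriptor has already been reduced to the same dimension as the local descriptors are all hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row of x (or a single vector) to unit Euclidean length."""
    x = np.asarray(x, dtype=np.float64)
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def fuse(local_desc, global_desc, weight=0.5):
    """Weighted average of local descriptors with one image-level descriptor.

    local_desc:  (N, D) array, one row per keypoint.
    global_desc: (D,) array, assumed to be already reduced to dimension D
                 (how that reduction is done is outside this sketch).
    weight:      hypothetical mixing weight in [0, 1]; 1.0 means local only.
    """
    local_desc = l2_normalize(local_desc)
    global_desc = l2_normalize(global_desc)
    fused = weight * local_desc + (1.0 - weight) * global_desc[None, :]
    return l2_normalize(fused)


def match(query_fused, map_fused):
    """Return, for each query row, the index of the nearest map row (L2)."""
    # Pairwise squared distances via broadcasting: (Q, 1, D) - (1, M, D).
    diff = query_fused[:, None, :] - map_fused[None, :, :]
    return np.argmin(np.sum(diff * diff, axis=-1), axis=1)


# Toy example: two map points with identical local appearance,
# observed in two different places (different global descriptors).
local = np.array([[1.0, 0.0, 0.0, 0.0]])
place_a = np.array([0.0, 1.0, 0.0, 0.0])
place_b = np.array([0.0, 0.0, 1.0, 0.0])

map_fused = np.vstack([fuse(local, place_a), fuse(local, place_b)])
query_fused = fuse(local, place_b)  # the query image was taken at place B

print(match(query_fused, map_fused))  # [1]: the point from place B wins
```

With `weight=1.0` the two map rows become identical and the match is ambiguous, which is the local-only situation the abstract describes; lowering the weight lets the global descriptor break the tie.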
Problem

Research questions and friction points this paper is trying to address.

Fusing global and local descriptors to reduce ambiguity in 2D-3D matching
Improving visual localization accuracy while reducing memory requirements
Enabling direct matching algorithms to benefit from global descriptors efficiently
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fuses local and global descriptors
Rearranges descriptor space geographically
Halves memory requirements compared with hierarchical methods
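One way to see why such a fusion pushes geographically distant candidates apart is to expand the distance between two fused descriptors. The notation here is an assumption for illustration and is not taken from the paper: $l$ is a local descriptor, $g$ is the global descriptor of the image it came from (taken to have the same dimension as $l$), $w \in [0, 1]$ is the averaging weight, subscripts $q$ and $m$ denote query and map, and any re-normalization after fusion is ignored.

```latex
\begin{aligned}
f &= w\,l + (1 - w)\,g, \\
\lVert f_q - f_m \rVert^2
  &= w^2 \lVert l_q - l_m \rVert^2
   + (1 - w)^2 \lVert g_q - g_m \rVert^2
   + 2\,w\,(1 - w)\,(l_q - l_m)^\top (g_q - g_m).
\end{aligned}
```

Under these assumptions, two map descriptors that look alike locally are still separated by the second term whenever their source images have dissimilar global descriptors, which is the situation the abstract describes for geographically distant places.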
Son Tung Nguyen
Queensland University of Technology
robotics, computer vision, visual localization
Alejandro Fontan
QUT Centre for Robotics, School of Electrical Engineering and Robotics, Queensland University of Technology, Brisbane, QLD 4000, Australia
Michael Milford
QUT Professor | Director, QUT Robotics Centre | ARC Laureate Fellow | Microsoft Fellow
Robotics, computational neuroscience, navigation, SLAM, RatSLAM
Tobias Fischer
QUT Centre for Robotics, School of Electrical Engineering and Robotics, Queensland University of Technology, Brisbane, QLD 4000, Australia