Reloc3r: Large-Scale Training of Relative Camera Pose Regression for Generalizable, Fast, and Accurate Visual Localization

📅 2024-12-11
🏛️ arXiv.org
📈 Citations: 2
Influential: 1
🤖 AI Summary
In visual localization, existing pose regression methods struggle to achieve both strong cross-scene generalization and accurate camera pose estimates. To address this, the paper proposes Reloc3r, a framework built on large-scale relative pose regression: a relative pose regression network is paired with a minimalist motion averaging module that combines the predicted relative poses between a query image and posed database images into an absolute pose. The network is trained on approximately eight million posed image pairs. Across six public datasets, the method delivers high-quality camera pose estimates in real time and generalizes to novel scenes.

📝 Abstract
Visual localization aims to determine the camera pose of a query image relative to a database of posed images. In recent years, deep neural networks that directly regress camera poses have gained popularity due to their fast inference capabilities. However, existing methods struggle to either generalize well to new scenes or provide accurate camera pose estimates. To address these issues, we present Reloc3r, a simple yet effective visual localization framework. It consists of an elegantly designed relative pose regression network, and a minimalist motion averaging module for absolute pose estimation. Trained on approximately eight million posed image pairs, Reloc3r achieves surprisingly good performance and generalization ability. We conduct extensive experiments on six public datasets, consistently demonstrating the effectiveness and efficiency of the proposed method. It provides high-quality camera pose estimates in real time and generalizes to novel scenes. Code: https://github.com/ffrivera0/reloc3r.
Problem

Research questions and friction points this paper is trying to address.

Improves generalization of camera pose regression in new scenes
Enhances accuracy of visual localization for query images
Enables real-time, high-quality camera pose estimation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Relative pose regression network for localization
Minimalist motion averaging for absolute pose
Large-scale training on approximately eight million posed image pairs
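
This summary does not describe how the motion averaging module works internally, so the following is only a minimal sketch of one common way to turn relative poses into an absolute pose; it should not be read as the Reloc3r implementation. It assumes world-to-camera rotations, known camera centers for the database images, and relative translations that are known only up to scale (directions). The rotation is estimated with a chordal mean and the camera center by a least-squares intersection of lines. All function names are hypothetical.

```python
import numpy as np


def average_rotations(rotations):
    """Chordal L2 mean: project the sum of rotation matrices onto SO(3)."""
    U, _, Vt = np.linalg.svd(np.sum(rotations, axis=0))
    # Force det = +1 so the result is a proper rotation, not a reflection.
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])
    return U @ D @ Vt


def intersect_lines(origins, directions):
    """Point minimizing the summed squared distance to a set of 3D lines."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for c, d in zip(origins, directions):
        d = d / np.linalg.norm(d)
        P = np.eye(3) - np.outer(d, d)  # projector orthogonal to the line
        A += P
        b += P @ c
    # Requires at least two non-parallel lines, otherwise A is singular.
    return np.linalg.solve(A, b)


def absolute_pose(db_rotations, db_centers, rel_rotations, rel_directions):
    """Combine per-database-image relative poses into one query pose.

    Assumed convention (not taken from the paper):
      x_query = R_rel @ x_db + s * t_rel, with unknown scale s > 0.
    Returns the query world-to-camera rotation and camera center.
    """
    # Each database image proposes a query rotation R_rel @ R_db.
    candidates = [R_rel @ R_db
                  for R_rel, R_db in zip(rel_rotations, db_rotations)]
    R_query = average_rotations(candidates)

    # Each relative translation direction defines a line in world space
    # that passes through the database camera center and the query center.
    lines = [-(R_db.T @ R_rel.T @ t)
             for R_db, R_rel, t in zip(db_rotations, rel_rotations,
                                       rel_directions)]
    c_query = intersect_lines(db_centers, lines)
    return R_query, c_query
```

With noise-free relative poses from at least two database images whose lines of sight to the query are not parallel, this sketch recovers the query pose exactly. With noisy predictions, a robust variant (for example a median over candidate rotations) may be preferable.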