GSVisLoc: Generalizable Visual Localization for Gaussian Splatting Scene Representations

📅 2025-08-25
📈 Citations: 0 (influential: 0)
🤖 AI Summary
This work addresses camera pose estimation for visual localization in scenes represented with 3D Gaussian Splatting (3DGS). The method, GSVisLoc, requires no modification or retraining of the 3DGS model and no additional reference images. Scene features are built by downsampling and encoding the 3D Gaussians, and image features by encoding patches of the query image. Localization then proceeds in three stages: (i) coarse matching between scene and image features; (ii) fine matching; and (iii) pose refinement. The method works directly on the explicit 3DGS scene representation and generalizes to novel scenes without additional training. On standard indoor and outdoor benchmarks it achieves competitive localization accuracy and outperforms existing 3DGS-based baselines.

📝 Abstract
We introduce GSVisLoc, a visual localization method designed for 3D Gaussian Splatting (3DGS) scene representations. Given a 3DGS model of a scene and a query image, our goal is to estimate the camera's position and orientation. We accomplish this by robustly matching scene features to image features. Scene features are produced by downsampling and encoding the 3D Gaussians, while image features are obtained by encoding image patches. Our algorithm proceeds in three steps: coarse matching, then fine matching, and finally pose refinement for an accurate final estimate. Importantly, our method leverages the explicit 3DGS scene representation for visual localization without requiring modifications, retraining, or additional reference images. We evaluate GSVisLoc on both indoor and outdoor scenes, demonstrating competitive localization performance on standard benchmarks while outperforming existing 3DGS-based baselines. Moreover, our approach generalizes effectively to novel scenes without additional training.
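
To make the scene-to-image matching idea concrete, here is a minimal sketch of one common way to match two sets of descriptors: mutual nearest neighbors on cosine similarity. The abstract does not say which matching rule GSVisLoc uses, so this is an illustrative assumption. The function name `mutual_nearest_matches` and the array shapes are hypothetical.

```python
import numpy as np

def mutual_nearest_matches(scene_feats, image_feats, eps=1e-12):
    """Return an (M, 2) array of (scene_index, image_index) pairs.

    Assumption (not from the paper): a pair is kept only when each
    descriptor is the other's best match under cosine similarity.
    scene_feats: (N_scene, D) descriptors, e.g. one per encoded Gaussian.
    image_feats: (N_image, D) descriptors, e.g. one per image patch.
    """
    # L2-normalize so the dot product equals cosine similarity.
    s = scene_feats / (np.linalg.norm(scene_feats, axis=1, keepdims=True) + eps)
    q = image_feats / (np.linalg.norm(image_feats, axis=1, keepdims=True) + eps)
    sim = s @ q.T  # (N_scene, N_image) similarity matrix

    best_image_for_scene = sim.argmax(axis=1)  # best patch for each scene feature
    best_scene_for_image = sim.argmax(axis=0)  # best scene feature for each patch

    scene_idx = np.arange(sim.shape[0])
    # Keep a scene feature only if its best patch points back to it.
    keep = best_scene_for_image[best_image_for_scene] == scene_idx
    return np.stack([scene_idx[keep], best_image_for_scene[keep]], axis=1)
```

Each returned pair links a 3D scene feature to a 2D image location, which is the kind of 2D-3D correspondence a pose solver can consume.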
Problem

Research questions and friction points this paper is trying to address.

Estimating camera position and orientation from query images
Matching scene features to image features robustly
Generalizing visual localization to novel scenes without retraining
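
Localization accuracy is usually reported as a translation error and a rotation error between the estimated and ground-truth camera poses. The sketch below shows the standard formulas. The function names and the pose convention (rotation matrix plus camera position) are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def pose_errors(R_est, t_est, R_gt, t_gt):
    """Return (translation_error, rotation_error_degrees).

    Assumes t is the camera position, so the translation error is in the
    same units as t. The rotation error is the angle of the relative
    rotation R_gt^T R_est.
    """
    trans_err = float(np.linalg.norm(np.asarray(t_est) - np.asarray(t_gt)))
    cos_angle = (np.trace(R_gt.T @ R_est) - 1.0) / 2.0
    # Clip to guard against values slightly outside [-1, 1] from rounding.
    rot_err = float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))
    return trans_err, rot_err

def rot_z(deg):
    """Rotation by `deg` degrees about the z axis (helper for the example)."""
    a = np.radians(deg)
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a),  np.cos(a), 0.0],
                     [0.0,        0.0,       1.0]])
```

For example, `pose_errors(rot_z(10), [0.3, 0.4, 0.0], np.eye(3), [0.0, 0.0, 0.0])` compares a pose that is rotated by 10 degrees and displaced by 0.5 units against an identity ground truth.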
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generalizable visual localization for Gaussian Splatting
Robust matching of encoded 3D Gaussian features
Three-step coarse-to-fine pose estimation refinement
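
A pose estimate obtained from 2D-3D matches is commonly scored by reprojection error: the 3D points are projected into the image with the candidate pose and compared with their matched pixels. The sketch below computes that error under a pinhole camera model. Whether GSVisLoc's refinement step minimizes this particular objective is not stated in the abstract, so treat it as an assumption; `project` and `mean_reprojection_error` are hypothetical names.

```python
import numpy as np

def project(points_world, R, t, K):
    """Project (N, 3) world points to (N, 2) pixels.

    Assumes a world-to-camera pose (R, t), i.e. x_cam = R @ x_world + t,
    and a 3x3 pinhole intrinsics matrix K. Points must lie in front of
    the camera (positive depth).
    """
    cam = points_world @ R.T + t      # world frame -> camera frame
    uvw = cam @ K.T                   # apply intrinsics
    return uvw[:, :2] / uvw[:, 2:3]   # perspective divide

def mean_reprojection_error(points_world, pixels, R, t, K):
    """Mean pixel distance between projected points and their matched pixels."""
    diff = project(points_world, R, t, K) - pixels
    return float(np.linalg.norm(diff, axis=1).mean())
```

A refinement loop would adjust `R` and `t` to lower this value; at the true pose with noise-free matches it is zero.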
Authors

Fadi Khatib, Weizmann Institute of Science
Dror Moran, Weizmann Institute of Science
Guy Trostianetsky, Weizmann Institute of Science
Yoni Kasten, NVIDIA Research (Computer Vision)
Meirav Galun, Weizmann Institute of Science
Ronen Basri, Professor of Computer Science, Weizmann Institute of Science (Computer Vision)