GeoLink: A 3D-Aware Framework Towards Better Generalization in Cross-View Geo-Localization

📅 2026-04-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses cross-view geo-localization in unseen regions and under complex conditions, where viewpoint variation and domain shift cause semantic inconsistency and limit generalization. The authors propose GeoLink, a framework that reconstructs scene point clouds offline from multi-view drone images to establish a 3D structural prior. GeoLink integrates this 3D geometry into 2D feature learning through a Geometric-aware Semantic Refinement module and a Unified View Relation Distillation module. The design improves semantic consistency while keeping the computational efficiency of purely 2D inference. Experiments on multiple benchmarks show that GeoLink consistently outperforms existing methods and generalizes better to unseen domains and diverse weather conditions.
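To make the "purely 2D inference" point concrete, here is a minimal sketch of how cross-view retrieval is commonly done once 2D embeddings exist: the query embedding is compared against a gallery by cosine similarity and the gallery is ranked. This is an illustrative assumption, not GeoLink's published pipeline; the function names (`l2_normalize`, `cosine_similarity`, `rank_gallery`) and the toy vectors are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def rank_gallery(query, gallery):
    """Return (gallery_index, score) pairs sorted from best to worst match.

    `query` is a hypothetical 2D-branch embedding of one view (e.g. drone),
    `gallery` holds embeddings of the other view (e.g. satellite).
    No 3D data is used here, mirroring the 2D-only inference claim.
    """
    scores = [(idx, cosine_similarity(query, g)) for idx, g in enumerate(gallery)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    query = [1.0, 0.0]
    gallery = [[0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]]
    print(rank_gallery(query, gallery))
```

In this sketch the point clouds never appear: any 3D guidance would have to be absorbed into the embeddings during training, which is what allows matching to stay 2D.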

📝 Abstract

Generalizable cross-view geo-localization aims to match the same location across views in unseen regions and conditions without GPS supervision. Its core difficulty lies in severe semantic inconsistency caused by viewpoint variation and poor generalization under domain shift. Existing methods mainly rely on 2D correspondence, but they are easily distracted by redundant shared information across views, leading to less transferable representations. To address this, we propose GeoLink, a 3D-aware, semantically consistent framework for generalizable cross-view geo-localization. Specifically, we reconstruct scene point clouds offline from multi-view drone images using VGGT, providing stable structural priors. Based on these 3D anchors, we improve 2D representation learning in two complementary ways. A Geometric-aware Semantic Refinement module mitigates potentially redundant and view-biased dependencies in 2D features under 3D guidance. In addition, a Unified View Relation Distillation module transfers 3D structural relations to 2D features, improving cross-view alignment while preserving a 2D-only inference pipeline. Extensive experiments on multiple benchmarks show that GeoLink consistently outperforms state-of-the-art methods and achieves superior generalization across unseen domains and diverse weather environments.
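The abstract does not give the form of the distillation objective. As a hedged illustration of what "transferring 3D structural relations to 2D features" can look like in general, the sketch below uses a generic relation-matching loss: build a pairwise cosine-similarity matrix over a batch of 3D features (teacher) and another over the corresponding 2D features (student), then penalize their mean squared difference. This is a swapped-in, generic technique and not the authors' loss; `relation_matrix` and `relation_distillation_loss` are hypothetical names.

```python
import math


def cosine(a, b):
    """Cosine similarity; defined as 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def relation_matrix(features):
    """Pairwise cosine similarities between all samples in a batch."""
    return [[cosine(fi, fj) for fj in features] for fi in features]


def relation_distillation_loss(feats_2d, feats_3d):
    """Mean squared difference between 2D and 3D relation matrices.

    Assumption: sample i in `feats_2d` and sample i in `feats_3d`
    describe the same scene. The two feature spaces may have different
    dimensions, because only sample-to-sample relations are compared.
    """
    if len(feats_2d) != len(feats_3d):
        raise ValueError("2D and 3D batches must contain the same samples")
    n = len(feats_2d)
    if n == 0:
        raise ValueError("batch must not be empty")
    rel_2d = relation_matrix(feats_2d)
    rel_3d = relation_matrix(feats_3d)
    total = sum(
        (rel_2d[i][j] - rel_3d[i][j]) ** 2 for i in range(n) for j in range(n)
    )
    return total / (n * n)


if __name__ == "__main__":
    feats_2d = [[1.0, 0.0], [1.0, 0.0]]  # student treats both scenes as identical
    feats_3d = [[1.0, 0.0], [0.0, 1.0]]  # teacher says the scenes are unrelated
    print(relation_distillation_loss(feats_2d, feats_3d))
```

Because the loss only compares relations between samples, the 3D branch is needed during training alone. That is consistent with the abstract's statement that inference stays 2D-only, although the actual mechanism in GeoLink may differ.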

Problem

Research questions and friction points this paper is trying to address.

cross-view geo-localization

generalization

semantic inconsistency

domain shift

3D-aware

Innovation

Methods, ideas, or system contributions that make the work stand out.

3D-aware

cross-view geo-localization

generalizable representation