GeoRanker: Distance-Aware Ranking for Worldwide Image Geolocalization

📅 2025-05-19
🤖 AI Summary
Global image geolocalization is challenging because visual content varies substantially across regions and precise GPS coordinates are difficult to regress directly. Existing two-stage retrieval methods rely on pointwise supervision and simplistic similarity metrics, neglecting the spatial relationships among candidate locations. This paper proposes a distance-aware ranking framework, the first to formulate geolocalization as a cross-modal ranking task under multi-order distance constraints. The authors design a multi-order distance loss that jointly optimizes absolute and relative geographic distance ranking. They also introduce GeoRanking, the first multimodal dataset explicitly constructed for geographic ranking, and employ large vision-language models for joint query-candidate encoding, integrating pointwise supervision with structured spatial supervision. The approach achieves state-of-the-art performance on IM2GPS3K and YFCC4K, significantly outperforming prior methods.

📝 Abstract
Worldwide image geolocalization, the task of predicting GPS coordinates from images taken anywhere on Earth, poses a fundamental challenge due to the vast diversity in visual content across regions. While recent approaches adopt a two-stage pipeline of retrieving candidates and selecting the best match, they typically rely on simplistic similarity heuristics and point-wise supervision, failing to model spatial relationships among candidates. In this paper, we propose GeoRanker, a distance-aware ranking framework that leverages large vision-language models to jointly encode query-candidate interactions and predict geographic proximity. In addition, we introduce a multi-order distance loss that ranks both absolute and relative distances, enabling the model to reason over structured spatial relationships. To support this, we curate GeoRanking, the first dataset explicitly designed for geographic ranking tasks with multimodal candidate information. GeoRanker achieves state-of-the-art results on two well-established benchmarks (IM2GPS3K and YFCC4K), significantly outperforming current best methods.
Problem

Research questions and friction points this paper is trying to address.

Predicting GPS coordinates from global images
Modeling spatial relationships among candidate locations
Improving geolocalization accuracy with distance-aware ranking
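
To make the notion of geographic distance between a query and its candidates concrete, the sketch below orders candidate GPS coordinates by great-circle distance to a reference point. It is only an illustration: the haversine formula, the function names `haversine_km` and `rank_candidates_by_distance`, and the example coordinates are assumptions made here, not details taken from the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (assumed constant for this sketch)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against tiny floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def rank_candidates_by_distance(reference_gps, candidates):
    """Return candidate indices sorted from nearest to farthest from reference_gps."""
    dists = [haversine_km(*reference_gps, *c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: dists[i])


# Hypothetical example: a reference point in Paris and three candidates.
paris = (48.8566, 2.3522)
candidates = [(40.7128, -74.0060), (51.5074, -0.1278), (48.85, 2.35)]
order = rank_candidates_by_distance(paris, candidates)
```

During training, an ordering like this, computed against the ground-truth location, could serve as the target that a ranking model is asked to reproduce; how the paper actually builds its targets is not stated on this page.
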
Innovation

Methods, ideas, or system contributions that make the work stand out.

Distance-aware ranking framework for geolocalization
Leverages vision-language models for query-candidate encoding
Multi-order distance loss for spatial relationship reasoning
Pengyue Jia
PhD candidate of Data Science, City University of Hong Kong
Information Retrieval, Large Language Models, GeoAI
Seongheon Park
University of Wisconsin-Madison
Machine Learning, Reliable AI
Song Gao
Department of Computer Sciences, University of Wisconsin-Madison
Xiangyu Zhao
Department of Data Science, City University of Hong Kong
Yixuan Li
Department of Computer Sciences, University of Wisconsin-Madison