Cross-View Geo-Localization with Street-View and VHR Satellite Imagery in Decentrality Settings

📅 2024-12-16
🤖 AI Summary
In GNSS-denied scenarios (e.g., disaster response, urban canyons), cross-view geo-localization matches street-level query images against geo-tagged satellite reference images. Accuracy degrades sharply under large decentrality, that is, when the query location is far from the center of its reference image. To study this, the paper contributes DReSS, a large-scale benchmark explicitly designed for decentrality-aware evaluation, and AuxGeo, a framework featuring (i) a Bird's-eye view Intermediary Module (BIM) that enables cross-view feature alignment and geometric disentanglement; (ii) a Position Constraint Module (PCM) that fuses dual-stream encoded features under explicit geometric priors; and (iii) multi-task metric learning that jointly optimizes semantic and geometric consistency. Evaluated on DReSS, AuxGeo mitigates the accuracy loss induced by large decentrality. It also achieves state-of-the-art performance on three existing benchmarks (CVUSA, CVACT, and VIGOR), indicating that it generalizes across cross-view localization settings.

📝 Abstract
Cross-View Geo-Localization tackles the challenge of image geo-localization in GNSS-denied environments, including disaster response scenarios, urban canyons, and dense forests, by matching street-view query images with geo-tagged aerial-view reference images. However, current research often relies on benchmarks and methods that assume center-aligned settings or account for only limited decentrality, which we define as the offset of the query image relative to the reference image center. Such assumptions fail to reflect real-world scenarios, where reference databases are typically pre-established without the possibility of ensuring perfect alignment for each query image. Moreover, decentrality is a critical factor warranting deeper investigation, as larger decentrality can substantially improve localization efficiency, but at the cost of lower localization accuracy. To address this limitation, we introduce DReSS (Decentrality Related Street-view and Satellite-view dataset), a novel dataset designed to evaluate cross-view geo-localization with a large geographic scope and diverse landscapes, emphasizing the decentrality issue. We also propose AuxGeo (Auxiliary Enhanced Geo-Localization) to further study the decentrality issue; it leverages a multi-metric optimization strategy with two novel modules: the Bird's-eye view Intermediary Module (BIM) and the Position Constraint Module (PCM). These modules improve localization accuracy despite the decentrality problem. Extensive experiments demonstrate that AuxGeo outperforms previous methods on our proposed DReSS dataset, mitigating the issue of large decentrality, and also achieves state-of-the-art performance on existing public datasets such as CVUSA, CVACT, and VIGOR.
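To make the definition of decentrality concrete, the following is a minimal sketch that measures how far a query location lies from the center of a square reference tile. It is not taken from the paper: the function name `decentrality`, the pixel-coordinate inputs, and the normalization (0.0 at the tile center, 1.0 on the tile border) are all assumptions chosen for illustration, and the paper may quantify decentrality differently.

```python
def decentrality(query_xy, tile_size):
    """Offset of a query location from the center of a square reference tile.

    query_xy  -- (x, y) pixel position of the query inside the tile (assumed input)
    tile_size -- side length of the square tile in pixels (assumed input)

    Returns (dx, dy, ratio). The ratio is an illustrative normalization,
    not the paper's metric: 0.0 at the tile center, 1.0 on the tile border.
    """
    half = tile_size / 2.0
    dx = query_xy[0] - half  # signed horizontal offset from the center
    dy = query_xy[1] - half  # signed vertical offset from the center
    ratio = max(abs(dx), abs(dy)) / half
    return dx, dy, ratio


# A center-aligned query versus one halfway toward the tile corner.
centered = decentrality((256, 256), 512)
off_center = decentrality((384, 128), 512)
```

Under this sketch, a center-aligned benchmark corresponds to a ratio near 0.0 for every query, while a decentrality-aware benchmark would include queries with ratios approaching 1.0.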
Problem

Research questions and friction points this paper is trying to address.

Cross-View Geo-Localization
Large Decentrality
Alignment Discrepancy
Innovation

Methods, ideas, or system contributions that make the work stand out.

DReSS Dataset
AuxGeo System
Cross-View Geolocalization
Panwang Xia
School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, 430079, Hubei, China
Lei Yu
Ant Group, Hangzhou, 310023, Zhejiang, China
Yi Wan
Pokee AI
reinforcement learning
Qiong Wu
School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, 430079, Hubei, China
Peiqi Chen
School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, 430079, Hubei, China
Liheng Zhong
Descartes Labs
Remote sensing, Agriculture
Yongxiang Yao
School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, 430079, Hubei, China
Dong Wei
School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, 430079, Hubei, China
Xinyi Liu
Wuhan University
3D Reconstruction, Point Cloud and Image Integration, Computational Origami
Lixiang Ru
Ant Group
computer vision, MLLM, multi-modal learning, remote sensing
Yingying Zhang
Ant Group, Hangzhou, 310023, Zhejiang, China
Jiangwei Lao
Ant Group
Computer Vision
Jingdong Chen
Ant Group, Hangzhou, 310023, Zhejiang, China
Ming Yang
Ant Group, Hangzhou, 310023, Zhejiang, China
Yongjun Zhang
School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, 430079, Hubei, China; Technology Innovation Center for Collaborative Applications of Natural Resources Data in GBA, Ministry of Natural Resources, Guangzhou, 510075, Guangdong, China