🤖 AI Summary
This work addresses a limitation of existing cross-view geo-localization methods: they are trained for a specific field of view (FoV) and degrade sharply under unseen FoVs or unknown orientations, so covering diverse conditions often requires multiple models. To overcome this, we propose SinGeo, a framework that is the first to apply curriculum learning to robust cross-view geo-localization. SinGeo uses a dual-branch discriminative learning scheme to enhance intra-view discriminability within both the ground and satellite branches, and it achieves robust localization across diverse FoVs within a single model. Notably, the approach requires no additional modules or explicit geometric transformations. We also design a consistency evaluation method to quantify model stability under varying views. Extensive experiments on four benchmark datasets show that SinGeo achieves state-of-the-art performance, outperforming methods trained specifically for extreme FoVs, while also transferring across architectures.
📝 Abstract
Robust cross-view geo-localization (CVGL) remains challenging despite rapid recent progress. Existing methods still rely on field-of-view (FoV)-specific training paradigms, where models are optimized under a fixed FoV but collapse when tested on unseen FoVs and unknown orientations. This limitation necessitates deploying multiple models to cover diverse variations. Although some studies have explored dynamic FoV training by simply randomizing FoVs, they fail to achieve robustness across diverse conditions because they implicitly assume that all FoVs are equally difficult. To address this gap, we present SinGeo, a simple yet powerful framework that enables a single model to achieve robust cross-view geo-localization without additional modules or explicit transformations. SinGeo employs a dual discriminative learning architecture that enhances intra-view discriminability within both the ground and satellite branches, and it is the first to introduce a curriculum learning strategy for robust CVGL. Extensive evaluations on four benchmark datasets show that SinGeo achieves state-of-the-art (SOTA) results under diverse conditions and notably outperforms methods specifically trained for extreme FoVs. Beyond superior performance, SinGeo also exhibits cross-architecture transferability. Furthermore, we propose a consistency evaluation method to quantitatively assess model stability under varying views, providing an explainable perspective for understanding and advancing robustness in future CVGL research. Code will be made available upon acceptance.
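The abstract does not specify how the curriculum is scheduled, so the following is only a minimal sketch of one possible FoV curriculum, not SinGeo's actual method. It assumes training starts from full panoramas and gradually admits narrower views, with a uniformly random heading to mimic unknown orientation. All names (`min_fov_for_progress`, `sample_view`, `crop_columns`), the linear schedule, and the 70° default are hypothetical choices made for illustration.

```python
import random


def min_fov_for_progress(progress, full_fov=360.0, hardest_fov=70.0):
    """Smallest FoV (degrees) allowed at a given training progress in [0, 1].

    Assumption: a linear easy-to-hard schedule; the paper may use another one.
    """
    progress = min(max(progress, 0.0), 1.0)  # clamp to [0, 1]
    return full_fov - progress * (full_fov - hardest_fov)


def sample_view(progress, rng, full_fov=360.0, hardest_fov=70.0):
    """Sample (fov_degrees, heading_degrees) for one ground-view example."""
    low = min_fov_for_progress(progress, full_fov, hardest_fov)
    fov = rng.uniform(low, full_fov)   # narrower views unlock as progress grows
    heading = rng.uniform(0.0, 360.0)  # unknown orientation: any heading
    return fov, heading


def crop_columns(pano_width, fov, heading):
    """Column indices of an equirectangular panorama covered by the view.

    The crop starts at the column matching `heading` and wraps around
    the right edge of the panorama.
    """
    start = int(round(heading / 360.0 * pano_width)) % pano_width
    count = min(pano_width, max(1, int(round(fov / 360.0 * pano_width))))
    return [(start + i) % pano_width for i in range(count)]


if __name__ == "__main__":
    rng = random.Random(0)
    for epoch, total in [(0, 10), (5, 10), (10, 10)]:
        fov, heading = sample_view(epoch / total, rng)
        cols = crop_columns(512, fov, heading)
        print(f"epoch {epoch}: fov={fov:.1f}, heading={heading:.1f}, width={len(cols)}")
```

Plain FoV randomization corresponds to calling `sample_view` with `progress=1.0` from the first step; the curriculum differs only in that the lower bound on FoV shrinks over time, which reflects the abstract's point that not all FoVs are equally difficult.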