🤖 AI Summary
This work addresses cross-view geo-localization between oblique aerial images and orthographic satellite references, where 3D geometric discrepancies such as view-dependent building facades and scale variations corrupt feature alignment. The authors propose (MGS)$^2$, a framework that combines Macro-Geometric Structure Filtering (MGSF) with Micro-Geometric Scale Adaptation (MGSA). MGSF uses dilated geometric gradients to filter out facade artifacts while enhancing view-invariant horizontal-plane features; MGSA uses depth priors and multi-branch feature fusion to rectify scale discrepancies before that filtering. A Geometric-Appearance Contrastive Distillation (GACD) loss is added to discriminate against oblique occlusions. The method reports state-of-the-art Recall@1 of 97.5% on University-1652 and 97.02% on SUES-200, and the authors also report strong cross-dataset generalization.
📝 Abstract
Cross-view geo-localization (CVGL) is pivotal for GNSS-denied UAV navigation but remains brittle under the drastic geometric misalignment between oblique aerial views and orthographic satellite references. Existing methods predominantly operate within a 2D manifold, neglecting the underlying 3D geometry where view-dependent vertical facades (macro-structure) and scale variations (micro-scale) severely corrupt feature alignment. To bridge this gap, we propose (MGS)$^2$, a geometry-grounded framework. The core of our innovation is the Macro-Geometric Structure Filtering (MGSF) module. Unlike pixel-wise matching sensitive to noise, MGSF leverages dilated geometric gradients to physically filter out high-frequency facade artifacts while enhancing the view-invariant horizontal plane, directly addressing the domain shift. To guarantee robust input for this structural filtering, we explicitly incorporate a Micro-Geometric Scale Adaptation (MGSA) module. MGSA utilizes depth priors to dynamically rectify scale discrepancies via multi-branch feature fusion. Furthermore, a Geometric-Appearance Contrastive Distillation (GACD) loss is designed to strictly discriminate against oblique occlusions. Extensive experiments demonstrate that (MGS)$^2$ achieves state-of-the-art performance, recording a Recall@1 of 97.5% on University-1652 and 97.02% on SUES-200. Furthermore, the framework exhibits superior cross-dataset generalization against geometric ambiguity. The code is available at: https://github.com/GabrielLi1473/MGS-Net
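For readers unfamiliar with the Recall@1 figures quoted above, the following is a minimal sketch of how a Recall@K retrieval metric is commonly computed from feature embeddings. It is not taken from the authors' repository: the function names (`cosine`, `recall_at_k`), the toy 2D features, and the single-label matching rule are illustrative assumptions, and the benchmarks' actual evaluation protocols may differ in detail.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def recall_at_k(query_feats, query_labels, gallery_feats, gallery_labels, k=1):
    """Fraction of queries whose true label appears among the k most similar gallery items.

    Assumption: a query counts as a hit if any of its top-k gallery items
    shares its label (hypothetical simplification of a benchmark protocol).
    """
    hits = 0
    for q, q_label in zip(query_feats, query_labels):
        # Rank gallery indices by descending similarity to the query.
        ranked = sorted(
            range(len(gallery_feats)),
            key=lambda i: cosine(q, gallery_feats[i]),
            reverse=True,
        )
        top_k_labels = [gallery_labels[i] for i in ranked[:k]]
        if q_label in top_k_labels:
            hits += 1
    return hits / len(query_feats)

# Toy data (made up for illustration): three "satellite" gallery embeddings
# and three "aerial" query embeddings with location labels.
gallery_feats = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
gallery_labels = ["A", "B", "C"]
query_feats = [[0.9, 0.1], [0.2, 0.8], [1.0, 0.2]]
query_labels = ["A", "B", "C"]

r1 = recall_at_k(query_feats, query_labels, gallery_feats, gallery_labels, k=1)
r2 = recall_at_k(query_feats, query_labels, gallery_feats, gallery_labels, k=2)
print(f"Recall@1 = {r1:.3f}, Recall@2 = {r2:.3f}")
```

In this toy case the third query is ranked closest to location A rather than its true location C, so it counts as a miss at K=1 but as a hit at K=2. A reported Recall@1 of 97.5% means the top-ranked gallery item is a correct match for 97.5% of queries.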