🤖 AI Summary
Cross-view geo-localization loses robustness when GPS drift shifts the recorded pose, leaving only partial (noisy) correspondences between drone- and satellite-view image pairs. This noisy correspondence problem has received limited attention and has not been formally defined.
Method: We formalize this challenge and propose PAUL (Partition and Augmentation by Uncertainty Learning), a framework built on uncertainty-aware co-augmentation and evidential co-training. Our approach integrates uncertainty estimation, loss discrepancy analysis, co-augmentation, feature-space reweighting, and evidential deep learning to enable fine-grained identification and selective enhancement of noisy samples.
Contribution/Results: Extensive experiments across varying noise ratios demonstrate consistent and significant improvements over state-of-the-art methods. Real-world evaluations confirm enhanced stability and accuracy in cross-view matching. The framework establishes a novel paradigm for UAV navigation and remote sensing localization, advancing robustness under imperfect geo-registration.
📝 Abstract
Cross-view geo-localization is a critical task for UAV navigation, event detection, and aerial surveying, as it enables matching between drone-captured and satellite imagery. Most existing approaches embed multi-modal data into a joint feature space to maximize the similarity of paired images. However, these methods typically assume perfect alignment of image pairs during training, which rarely holds in real-world scenarios. In practice, factors such as urban canyon effects, electromagnetic interference, and adverse weather frequently induce GPS drift, resulting in systematic alignment shifts where only partial correspondences exist between pairs. Despite its prevalence, this source of noisy correspondence has received limited attention in current research. In this paper, we formally introduce and address the Noisy Correspondence on Cross-View Geo-Localization (NC-CVGL) problem, aiming to bridge the gap between idealized benchmarks and practical applications. To this end, we propose PAUL (Partition and Augmentation by Uncertainty Learning), a novel framework that partitions and augments training data based on estimated data uncertainty through uncertainty-aware co-augmentation and evidential co-training. Specifically, PAUL selectively augments regions with high correspondence confidence and uses uncertainty estimation to refine feature learning, effectively suppressing noise from misaligned pairs. Unlike traditional filtering or label correction, PAUL leverages both data uncertainty and loss discrepancy for targeted partitioning and augmentation, thus providing robust supervision for noisy samples. Comprehensive experiments validate the effectiveness of the individual components of PAUL, which consistently outperforms competing noisy-correspondence methods across various noise ratios.
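The abstract gives no formulas, so the following is only a minimal sketch of how partitioning by data uncertainty and loss discrepancy could look, not the PAUL implementation. It assumes the standard evidential deep learning formulation, in which non-negative evidence for K classes defines a Dirichlet with alpha = evidence + 1 and uncertainty u = K / sum(alpha). It also assumes that the loss discrepancy is the gap between the per-pair losses of two co-trained networks. The function names and both thresholds (`u_max`, `gap_max`) are hypothetical.

```python
from typing import List, Sequence, Tuple


def evidential_uncertainty(evidence: Sequence[float]) -> float:
    """Uncertainty u = K / S for a Dirichlet with alpha_k = evidence_k + 1."""
    if len(evidence) == 0:
        raise ValueError("evidence must contain at least one class")
    if any(e < 0 for e in evidence):
        raise ValueError("evidence must be non-negative")
    num_classes = len(evidence)
    strength = sum(e + 1.0 for e in evidence)  # S = sum of alpha_k
    return num_classes / strength


def partition_pairs(
    evidences: Sequence[Sequence[float]],
    losses_a: Sequence[float],
    losses_b: Sequence[float],
    u_max: float = 0.5,    # hypothetical uncertainty threshold
    gap_max: float = 0.2,  # hypothetical loss-discrepancy threshold
) -> Tuple[List[int], List[int]]:
    """Split training-pair indices into (clean, noisy) lists.

    A pair is treated as clean only if its uncertainty is low AND the two
    networks (assumed co-training setup) agree on its loss.
    """
    clean: List[int] = []
    noisy: List[int] = []
    for i, (ev, la, lb) in enumerate(zip(evidences, losses_a, losses_b)):
        u = evidential_uncertainty(ev)
        gap = abs(la - lb)
        if u <= u_max and gap <= gap_max:
            clean.append(i)
        else:
            noisy.append(i)
    return clean, noisy


if __name__ == "__main__":
    # Three toy pairs: confident and consistent, no evidence, inconsistent losses.
    evidences = [[8.0, 0.0], [0.0, 0.0], [8.0, 0.0]]
    clean, noisy = partition_pairs(evidences, [0.10, 0.10, 0.10], [0.15, 0.12, 0.90])
    print("clean:", clean, "noisy:", noisy)
```

In this sketch a pair with no evidence has u = 1 and lands in the noisy set, as does a pair on which the two networks disagree strongly. How PAUL actually combines the two signals, and how it then augments each partition, is not specified in the abstract.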