🤖 AI Summary
This work addresses a limitation of existing cross-view object geo-localization methods: they assume a single target per query and therefore cannot meet the multi-object requirements of real-world applications. We introduce a new task, Cross-View Multi-Object Geo-Localization (CVMOGL), build the CMLocation benchmark, which comprises two datasets (CMLocation-V1 and CMLocation-V2), and propose MOGeo, a method that matches and localizes multiple targets across views simultaneously. Experiments under various application scenarios validate the effectiveness of MOGeo against existing state-of-the-art methods and move beyond the conventional one-to-one matching paradigm. The results also show that cross-view object geo-localization in this more realistic setting remains challenging, which motivates further research in this direction.
📝 Abstract
Cross-View Object Geo-Localization (CVOGL) aims to locate, within a corresponding satellite image, an object of interest specified in a query image. Existing methods typically assume that the query image contains only a single object. This assumption does not match the complex, multi-object geo-localization requirements of real-world applications, which makes these methods unsuitable for practical scenarios. To bridge the gap between the realistic setting and the existing task, we propose a new task called Cross-View Multi-Object Geo-Localization (CVMOGL). To advance the CVMOGL task, we first construct a benchmark, CMLocation, which includes two datasets: CMLocation-V1 and CMLocation-V2. We then propose a novel cross-view multi-object geo-localization method, MOGeo, and benchmark it against existing state-of-the-art methods. Extensive experiments under various application scenarios validate the effectiveness of our method. The results also demonstrate that cross-view object geo-localization in the more realistic setting remains a challenging problem, which should encourage further research in this area.