🤖 AI Summary
This work addresses the poor cross-scene generalization and frequent failure of aerial visual localization in dense urban environments, particularly in high-rise areas. To overcome these limitations, the authors propose LoD-Loc v3, which shifts the localization paradigm from semantic silhouette alignment to instance-level silhouette alignment. Its segmentation model is trained on InsLoD-Loc, a large-scale synthetic instance segmentation dataset, and the predicted building instance silhouettes are aligned with a low-detail urban 3D model. This approach enables zero-shot cross-scene generalization and substantially reduces pose estimation ambiguity without requiring real-world annotations. Experimental results show that LoD-Loc v3 outperforms current state-of-the-art methods by a large margin in both cross-scene and dense urban localization tasks.
📝 Abstract
We present LoD-Loc v3, a novel method for generalized aerial visual localization in dense urban environments. While its predecessor, LoD-Loc v2, achieves localization by aligning semantic building silhouettes with low-detail city models, it suffers from two key limitations: poor cross-scene generalization and frequent failure in dense building scenes. Our method addresses these challenges through two key innovations. First, we develop a new synthetic data generation pipeline that produces InsLoD-Loc, the largest instance segmentation dataset for aerial imagery to date, comprising 100k images with precise instance-level building annotations. Models trained on this dataset exhibit strong zero-shot generalization. Second, we reformulate the localization paradigm by shifting from semantic to instance silhouette alignment, which significantly reduces pose estimation ambiguity in dense scenes. Extensive experiments demonstrate that LoD-Loc v3 outperforms existing state-of-the-art (SOTA) baselines by a large margin in both cross-scene and dense urban scenarios. The project is available at https://nudt-sawlab.github.io/LoD-Locv3/.
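To illustrate why instance silhouettes can disambiguate poses in dense scenes, the toy sketch below compares a semantic (building vs. background) overlap score with an instance-level one. It is not the paper's alignment algorithm: the function names `semantic_iou` and `instance_score`, the best-IoU matching rule, and the one-row label maps are all assumptions made for illustration. Label 0 is assumed to mean background, and any positive integer is a building instance ID.

```python
import numpy as np

def semantic_iou(pred, rendered):
    """IoU of the building-vs-background masks (instance IDs ignored)."""
    p, r = pred > 0, rendered > 0
    union = np.logical_or(p, r).sum()
    return float(np.logical_and(p, r).sum() / union) if union else 0.0

def instance_score(pred, rendered):
    """Mean, over rendered instances, of the best IoU with any predicted instance.

    Hypothetical scoring rule: IDs need not correspond between the two maps,
    so each rendered instance is matched to its best-overlapping prediction.
    """
    pred_ids = [i for i in np.unique(pred) if i != 0]
    rend_ids = [i for i in np.unique(rendered) if i != 0]
    if not pred_ids or not rend_ids:
        return 0.0
    scores = []
    for rid in rend_ids:
        r = rendered == rid
        best = 0.0
        for pid in pred_ids:
            p = pred == pid
            inter = np.logical_and(p, r).sum()
            union = np.logical_or(p, r).sum()  # never zero: r is non-empty
            best = max(best, inter / union)
        scores.append(best)
    return float(np.mean(scores))

# Toy one-row "images": two adjacent buildings fill the whole view.
pred = np.array([[1, 1, 1, 1, 2, 2, 2, 2]])   # predicted instance map
good = np.array([[1, 1, 1, 1, 2, 2, 2, 2]])   # render at the correct pose
bad = np.array([[1, 1, 2, 2, 2, 2, 3, 3]])    # render at a shifted pose

print(semantic_iou(pred, good), semantic_iou(pred, bad))
print(instance_score(pred, good), instance_score(pred, bad))
```

In this toy case both pose hypotheses cover exactly the same foreground pixels, so the semantic score cannot tell them apart, whereas the instance-level score drops for the shifted pose because the building boundaries no longer line up.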