LoD-Loc v3: Generalized Aerial Localization in Dense Cities using Instance Silhouette Alignment

📅 2026-03-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the poor cross-scene generalization and frequent failure of aerial visual localization in dense urban environments, particularly in high-rise areas. To overcome these limitations, the authors propose LoD-Loc v3, which moves the localization paradigm from semantic silhouette alignment to instance-level silhouette alignment. The segmentation model is trained on InsLoD-Loc, a large-scale synthetic instance segmentation dataset of 100k aerial images, and an alignment algorithm matches the predicted building instance silhouettes against a low-detail 3D city model. This approach enables zero-shot cross-scene generalization and substantially reduces pose estimation ambiguity without requiring real-world annotations. Experiments show that LoD-Loc v3 outperforms current state-of-the-art methods by a large margin in both cross-scene and dense urban localization tasks.

📝 Abstract

We present LoD-Loc v3, a novel method for generalized aerial visual localization in dense urban environments. While prior work LoD-Loc v2 achieves localization through semantic building silhouette alignment with low-detail city models, it suffers from two key limitations: poor cross-scene generalization and frequent failure in dense building scenes. Our method addresses these challenges through two key innovations. First, we develop a new synthetic data generation pipeline that produces InsLoD-Loc, the largest instance segmentation dataset for aerial imagery to date, comprising 100k images with precise building instance annotations. This enables trained models to exhibit remarkable zero-shot generalization capability. Second, we reformulate the localization paradigm by shifting from semantic to instance silhouette alignment, which significantly reduces pose estimation ambiguity in dense scenes. Extensive experiments demonstrate that LoD-Loc v3 outperforms existing state-of-the-art (SOTA) baselines by a large margin in both cross-scene and dense urban scenarios. The project is available at https://nudt-sawlab.github.io/LoD-Locv3/.
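
To illustrate why instance silhouettes can disambiguate poses that semantic silhouettes cannot, here is a minimal toy sketch. It is not the paper's alignment algorithm: the function names (`semantic_iou`, `instance_score`), the best-match IoU scoring, and the one-row "images" are all assumptions made for illustration. Each array is an instance-ID map where 0 is background and every positive integer is one building; the IDs in the prediction and in the rendering need not agree.

```python
import numpy as np


def semantic_iou(pred_ids, rend_ids):
    """IoU of the binary building masks, ignoring which building is which."""
    pred_fg = pred_ids > 0
    rend_fg = rend_ids > 0
    union = np.logical_or(pred_fg, rend_fg).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as a perfect match
    return float(np.logical_and(pred_fg, rend_fg).sum() / union)


def instance_score(pred_ids, rend_ids):
    """Mean, over rendered buildings, of the best IoU with any predicted instance.

    This is a hypothetical scoring rule chosen for the sketch.
    """
    pred_labels = [i for i in np.unique(pred_ids) if i != 0]
    rend_labels = [i for i in np.unique(rend_ids) if i != 0]
    if not rend_labels:
        return 0.0
    scores = []
    for r in rend_labels:
        rend_mask = rend_ids == r
        best = 0.0
        for p in pred_labels:
            pred_mask = pred_ids == p
            inter = np.logical_and(rend_mask, pred_mask).sum()
            union = np.logical_or(rend_mask, pred_mask).sum()
            best = max(best, inter / union)
        scores.append(best)
    return float(np.mean(scores))


# Predicted instances from the query image: two adjacent buildings, 4 px each.
pred = np.array([[1, 1, 1, 1, 2, 2, 2, 2]])

# Instance-ID maps rendered from the low-detail model under two pose hypotheses.
pose_a = np.array([[7, 7, 7, 7, 9, 9, 9, 9]])  # building boundary in the right place
pose_b = np.array([[7, 7, 9, 9, 9, 9, 9, 9]])  # building boundary shifted by 2 px

for name, rend in [("pose_a", pose_a), ("pose_b", pose_b)]:
    print(name, semantic_iou(pred, rend), instance_score(pred, rend))
```

In this toy case both hypotheses cover exactly the same foreground pixels, so the semantic score cannot tell them apart, while the instance score is lower for the hypothesis whose building boundary is misplaced. Contiguous building blocks in dense scenes are where this kind of ambiguity would be expected to arise.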

Problem

Research questions and friction points this paper is trying to address.

aerial localization

dense urban environments

cross-scene generalization

building silhouette alignment

pose estimation ambiguity

Innovation

Methods, ideas, or system contributions that make the work stand out.

instance silhouette alignment

synthetic data generation

zero-shot generalization