LoD-Loc v3: Generalized Aerial Localization in Dense Cities using Instance Silhouette Alignment

📅 2026-03-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the poor cross-scene generalization and frequent failure of aerial visual localization in dense urban environments, particularly high-rise areas. The authors propose LoD-Loc v3, which shifts the localization paradigm from semantic silhouette alignment to instance-level silhouette alignment against a low-detail 3D city model. To support this, they build InsLoD-Loc, a large-scale synthetic instance segmentation dataset; models trained on it generalize zero-shot across scenes without real-world annotations. Aligning individual building instances rather than a single building mask reduces pose estimation ambiguity in dense scenes. Experiments show that LoD-Loc v3 outperforms current state-of-the-art methods by a large margin in both cross-scene and dense urban localization.

📝 Abstract
We present LoD-Loc v3, a novel method for generalized aerial visual localization in dense urban environments. While prior work LoD-Loc v2 achieves localization through semantic building silhouette alignment with low-detail city models, it suffers from two key limitations: poor cross-scene generalization and frequent failure in dense building scenes. Our method addresses these challenges through two key innovations. First, we develop a new synthetic data generation pipeline that produces InsLoD-Loc, the largest instance segmentation dataset for aerial imagery to date, comprising 100k images with precise building instance annotations. Models trained on this dataset exhibit remarkable zero-shot generalization capability. Second, we reformulate the localization paradigm by shifting from semantic to instance silhouette alignment, which significantly reduces pose estimation ambiguity in dense scenes. Extensive experiments demonstrate that LoD-Loc v3 outperforms existing state-of-the-art (SOTA) baselines by a large margin in both cross-scene and dense urban scenarios. The project is available at https://nudt-sawlab.github.io/LoD-Locv3/.
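
The abstract's second point, that instance silhouettes disambiguate poses which semantic silhouettes cannot, can be illustrated with a toy example. The sketch below is not the authors' algorithm: the flattened label maps, the IoU-based scores, and the function names (`semantic_score`, `instance_score`) are all assumptions made for illustration. It compares two hypothetical renderings of a city model against a query segmentation, where 0 is background and positive integers are building instances.

```python
# Toy illustration only: scoring functions and data are hypothetical,
# not taken from the LoD-Loc v3 paper.

def pixels(labels, instance_id=None):
    """Indices of building pixels; restricted to one instance if an id is given."""
    return {i for i, v in enumerate(labels)
            if v != 0 and (instance_id is None or v == instance_id)}

def iou(a, b):
    """Intersection over union of two pixel-index sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def semantic_score(query, rendered):
    """IoU of the building-vs-background masks; instance ids are ignored."""
    return iou(pixels(query), pixels(rendered))

def instance_score(query, rendered):
    """Mean over rendered instances of the best IoU with any query instance.

    Ids need not correspond between the two maps, since each rendered
    instance is simply paired with its best-overlapping query instance.
    """
    query_ids = sorted(set(query) - {0})
    rendered_ids = sorted(set(rendered) - {0})
    if not query_ids or not rendered_ids:
        return 0.0
    best = [max(iou(pixels(rendered, r), pixels(query, q)) for q in query_ids)
            for r in rendered_ids]
    return sum(best) / len(best)

# Three adjacent buildings seen in the query image (flattened to one row).
query = [0, 1, 1, 2, 2, 2, 2, 3, 3, 0]
# Hypothesis A renders the same partition; hypothesis B covers the same
# footprint but splits it into buildings at different boundaries.
hyp_a = [0, 1, 1, 2, 2, 2, 2, 3, 3, 0]
hyp_b = [0, 1, 1, 1, 1, 2, 2, 3, 3, 0]

for name, hyp in (("A", hyp_a), ("B", hyp_b)):
    print(name, semantic_score(query, hyp), instance_score(query, hyp))
```

Both hypotheses tie on the semantic score because their merged building footprints are identical, while the instance score separates them because hypothesis B places the building boundaries in the wrong positions. This is the kind of ambiguity that adjacent buildings create in dense scenes.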
Problem

Research questions and friction points this paper is trying to address.

aerial localization
dense urban environments
cross-scene generalization
building silhouette alignment
pose estimation ambiguity
Innovation

Methods, ideas, or system contributions that make the work stand out.

instance silhouette alignment
synthetic data generation
zero-shot generalization
aerial visual localization
dense urban environments
Shuaibang Peng
National University of Defense Technology
Juelin Zhu
National University of Defense Technology
Xia Li
National University of Defense Technology
Kun Yang
Northwest Polytechnical University Xi'an
3D reconstruction, Thermal Infrared Reconstruction, Head Avatar for Video Conference
Maojun Zhang
Zhejiang University
semantic communication, machine learning, wireless communication, AIGC
Yu Liu
National University of Defense Technology
Shen Yan
National University of Defense Technology