Object Detection as an Optional Basis: A Graph Matching Network for Cross-View UAV Localization

📅 2025-11-04

🤖 AI Summary

Problem: Accurate visual localization of unmanned aerial vehicles (UAVs) remains challenging in GNSS-denied environments, largely because of cross-domain discrepancies between satellite imagery and UAV-captured aerial images: large spatiotemporal gaps, drastic viewpoint differences, and heterogeneous modalities (e.g., visible-light vs. infrared). Method: This paper proposes a cross-view image-matching localization framework that first uses an object detector to extract multi-scale salient instances and builds a fine-grained, object-level graph from them. A dedicated graph neural network (GNN) then jointly models intra-image and inter-image node relationships, coupled with a learnable node-similarity metric intended to mitigate modality discrepancies. Contribution/Results: Experiments on public and real-world datasets show strong cross-domain image retrieval and geolocalization performance. The method is reported to remain robust and to generalize under large modality gaps, such as infrared-visible matching, which makes it a candidate approach for UAV localization without GNSS support.

📝 Abstract

With the rapid growth of the low-altitude economy, UAVs have become crucial for measurement and tracking in patrol systems. However, in GNSS-denied areas, satellite-based localization methods are prone to failure. This paper presents a cross-view UAV localization framework that performs map matching via object detection, aimed at effectively addressing cross-temporal, cross-view, heterogeneous aerial image matching. In typical pipelines, UAV visual localization is formulated as an image-retrieval problem: features are extracted to build a localization map, and the pose of a query image is estimated by matching it to a reference database with known poses. Because publicly available UAV localization datasets are limited, many approaches recast localization as a classification task and rely on scene labels in these datasets to ensure accuracy. Other methods seek to reduce cross-domain differences using polar-coordinate reprojection, perspective transformations, or generative adversarial networks; however, they can suffer from misalignment, content loss, and limited realism. In contrast, we leverage modern object detection to accurately extract salient instances from UAV and satellite images, and integrate a graph neural network to reason about inter-image and intra-image node relationships. Using a fine-grained, graph-based node-similarity metric, our method achieves strong retrieval and localization performance. Extensive experiments on public and real-world datasets show that our approach handles heterogeneous appearance differences effectively and generalizes well, making it applicable to scenarios with larger modality gaps, such as infrared-visible image matching. Our dataset will be publicly available at the following URL: https://github.com/liutao23/ODGNNLoc.git.
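The image-retrieval formulation described in the abstract can be illustrated with a minimal sketch. It assumes that each image has already been reduced to a fixed-length descriptor and that each reference (satellite) tile carries a known position; the function names, the cosine-similarity ranking, and the array layout are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length."""
    x = np.asarray(x, dtype=float)
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def retrieve_location(query_desc, ref_descs, ref_positions, top_k=1):
    """Rank reference tiles by cosine similarity to the query descriptor.

    query_desc:    (D,) descriptor of the UAV query image (hypothetical input)
    ref_descs:     (N, D) descriptors of the reference tiles
    ref_positions: (N, 2) known positions of the reference tiles
    Returns the indices, similarities, and positions of the top_k matches.
    """
    q = l2_normalize(query_desc)
    r = l2_normalize(ref_descs)
    sims = r @ q                      # cosine similarity to every reference
    order = np.argsort(-sims)[:top_k]  # best match first
    return order, sims[order], np.asarray(ref_positions)[order]
```

The position of the best-ranked reference tile serves as the location estimate for the query; any refinement beyond this nearest-neighbour lookup is outside the scope of the sketch.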

Problem

Research questions and friction points this paper is trying to address.

Addresses UAV localization failure in GNSS-denied environments

Solves cross-view aerial image matching with appearance differences

Handles heterogeneous modality gaps in visual localization systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

Object detection extracts salient instances from images

Graph neural network models inter-image and intra-image node relationships

Fine-grained graph-based metric achieves strong localization performance
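The three ideas above (detected instances as nodes, relational reasoning over a graph, and a node-level similarity score) can be sketched in a simplified form. This is an assumption-laden toy: it replaces the learnable GNN and metric with a fixed mean aggregation over a k-nearest-neighbour graph and a symmetric best-match score, and it takes per-object feature vectors and box centers as given inputs from some detector. All names here are hypothetical.

```python
import numpy as np


def knn_adjacency(centers, k=2):
    """Symmetric k-nearest-neighbour adjacency built from object centers."""
    centers = np.asarray(centers, dtype=float)
    n = len(centers)
    dist = np.linalg.norm(centers[:, None] - centers[None, :], axis=-1)
    np.fill_diagonal(dist, np.inf)          # a node is not its own neighbour
    adj = np.zeros((n, n))
    k = min(k, n - 1)
    if k > 0:
        nbrs = np.argsort(dist, axis=1)[:, :k]
        adj[np.arange(n)[:, None], nbrs] = 1.0
    return np.maximum(adj, adj.T)           # make the graph undirected


def propagate(feats, adj):
    """One round of mean-neighbour aggregation (stand-in for a learned GNN layer)."""
    feats = np.asarray(feats, dtype=float)
    deg = adj.sum(axis=1, keepdims=True)
    neigh = adj @ feats / np.maximum(deg, 1.0)
    out = 0.5 * feats + 0.5 * neigh         # mix own and neighbourhood features
    return out / (np.linalg.norm(out, axis=1, keepdims=True) + 1e-12)


def graph_similarity(feats_a, centers_a, feats_b, centers_b, k=2):
    """Score two object graphs by their node-to-node cosine similarities."""
    a = propagate(feats_a, knn_adjacency(centers_a, k))
    b = propagate(feats_b, knn_adjacency(centers_b, k))
    sim = a @ b.T                           # (Na, Nb) node similarity matrix
    # Average of each node's best match, taken in both directions.
    score = 0.5 * (sim.max(axis=1).mean() + sim.max(axis=0).mean())
    return score, sim
```

In a retrieval setting, a score of this kind would be computed between the query graph and each reference graph, and the references ranked by it. A learned metric, as the paper describes, would replace both the fixed aggregation and the best-match averaging.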
