AI Summary
This work addresses scale variation in aerial visual place recognition caused by changes in flight altitude. The authors propose HE-VPR, a framework that decouples altitude estimation from place recognition by attaching two lightweight side-branch adapters to a frozen DINOv2 backbone: one retrieves the altitude partition of the query image, and the other matches places within the corresponding sub-database. A center-weighted masking strategy further improves robustness to scale differences. By combining altitude awareness with partitioned retrieval, HE-VPR improves Recall@1 by up to 6.1% and reduces memory usage by up to 90% compared with existing ViT-based approaches, as demonstrated on two self-collected multi-altitude aerial datasets.
Abstract
In this work, we propose HE-VPR, a visual place recognition (VPR) framework that incorporates height estimation. Our system decouples height inference from place recognition, allowing both modules to share a frozen DINOv2 backbone. Two lightweight bypass adapter branches are integrated into the system. The first estimates the height partition of the query image via retrieval from a compact height database, and the second performs VPR within the corresponding height-specific sub-database. This adapter design reduces training cost and significantly shrinks the database search space. We also adopt a center-weighted masking strategy to further improve robustness against scale differences. Experiments on two challenging self-collected multi-altitude datasets demonstrate that HE-VPR achieves up to 6.1% Recall@1 improvement over state-of-the-art ViT-based baselines and reduces memory usage by up to 90%. These results indicate that HE-VPR offers a scalable and efficient solution for height-aware aerial VPR, enabling practical deployment in GNSS-denied environments. All code and datasets for this work have been released at https://github.com/hmf21/HE-VPR.
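
The two-stage lookup described in the abstract can be illustrated with a minimal sketch. This is not the released implementation. It assumes the two adapter branches have already produced a height descriptor and a place descriptor for the query, and it uses cosine similarity as the matching metric, which is also an assumption. The function names, the toy 2-D descriptors, and the partition labels (`"low"`, `"high"`) are hypothetical.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def cosine(a, b):
    """Cosine similarity between two descriptors (assumed metric)."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

def nearest(query, entries):
    """Return the (label, descriptor) entry most similar to the query."""
    return max(entries, key=lambda entry: cosine(query, entry[1]))

def two_stage_retrieve(height_desc, place_desc, height_db, place_dbs):
    """Stage 1 selects a height partition; stage 2 searches only that partition.

    height_db: list of (partition_id, descriptor), the compact height database
    place_dbs: dict mapping partition_id -> list of (place_id, descriptor)
    """
    partition, _ = nearest(height_desc, height_db)
    place, _ = nearest(place_desc, place_dbs[partition])
    return partition, place

# Hypothetical toy data: 2-D descriptors chosen only for readability.
height_db = [("low", [1.0, 0.0]), ("high", [0.0, 1.0])]
place_dbs = {
    "low": [("A_low", [1.0, 0.1]), ("B_low", [0.1, 1.0])],
    "high": [("A_high", [0.9, 0.2]), ("B_high", [0.2, 0.9])],
}

partition, place = two_stage_retrieve(
    height_desc=[0.2, 0.9],
    place_desc=[0.3, 1.0],
    height_db=height_db,
    place_dbs=place_dbs,
)
print(partition, place)
```

In this sketch the second stage compares the query only against entries of the selected partition, which is the sense in which partitioning shrinks the search space. The memory and accuracy figures reported in the abstract depend on the actual descriptors and datasets and are not reproduced by this toy example.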