Dynamic Semantic-Aware Correlation Modeling for UAV Tracking

📅 2025-10-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing UAV tracking methods prioritize speed but neglect semantic awareness, so the search region extracts inaccurate localization information from the template, especially under challenges such as camera motion, rapid target movement, and low-resolution input. To address this, we propose a dynamic semantic-aware correlation modeling framework. First, we design a Transformer-driven Correlation Map Enhancement module to strengthen the search region's response to semantically critical template features. Second, we introduce a Dynamic Semantic Relevance Generator that jointly handles semantic understanding and correlation modeling. Third, we employ structured pruning to construct multiple model variants, enabling flexible trade-offs between accuracy and efficiency. Our approach achieves competitive accuracy and robustness across multiple UAV tracking benchmarks. The source code is publicly available.

📝 Abstract
UAV tracking is widely applicable in scenarios such as disaster rescue, environmental monitoring, and logistics transportation. However, existing UAV tracking methods predominantly emphasize speed and rarely explore semantic awareness, which hinders the search region from extracting accurate localization information from the template. This limitation results in suboptimal performance under typical UAV tracking challenges such as camera motion, fast motion, and low resolution. To address this issue, we propose a dynamic semantic-aware correlation modeling tracking framework. The core of our framework is a Dynamic Semantic Relevance Generator, which, in combination with the correlation map from the Transformer, explores semantic relevance. This approach enhances the search region's ability to extract important information from the template, improving accuracy and robustness under the aforementioned challenges. Additionally, to increase tracking speed, we design a pruning method for the proposed framework. We thus present multiple model variants that trade off speed against accuracy, enabling flexible deployment according to the available computational resources. Experimental results validate the effectiveness of our method, which achieves competitive performance on multiple UAV tracking datasets. The code is available at https://github.com/zxyyxzz/DSATrack.
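To make the term "correlation map" concrete, the sketch below computes a generic attention-style map between search-region tokens and template tokens: each row says how strongly one search token responds to each template token. This is only an illustration of the general idea and is not the paper's Dynamic Semantic Relevance Generator; the function name `correlation_map`, the scaled dot-product scoring, and the toy 2-dimensional tokens are all assumptions made for the example.

```python
import math


def correlation_map(search_tokens, template_tokens):
    """Row-wise softmax of scaled dot products (hypothetical helper).

    Returns one row per search token; each row is a distribution over
    template tokens.
    """
    dim = len(template_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    rows = []
    for s in search_tokens:
        # Similarity of this search token to every template token.
        scores = [scale * sum(a * b for a, b in zip(s, t)) for t in template_tokens]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(x - peak) for x in scores]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows


# Toy tokens (assumed values, not taken from the paper).
search = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
template = [[1.0, 0.0], [0.0, 1.0]]
corr = correlation_map(search, template)
```

In this toy case the first search token responds most strongly to the first template token, while the third responds equally to both. A semantic-aware method would presumably reshape such a map so that semantically important template tokens receive more weight; how the paper does this is not specified in the abstract.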
Problem

Research questions and friction points this paper is trying to address.

Improves UAV tracking accuracy under camera motion
Enhances semantic relevance modeling for better localization
Balances tracking speed and accuracy through pruning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic Semantic Relevance Generator for correlation modeling
Transformer-based semantic relevance exploration for tracking
Pruning method enables speed-accuracy trade-off optimization
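The speed-accuracy trade-off from pruning can be illustrated with a small sketch. It assumes that each block of a model has an importance score and that a variant is formed by keeping only the highest-scoring blocks in their original order. The paper's actual pruning criterion is not described here, so the function `prune_blocks`, the scores, and the variant sizes are all hypothetical.

```python
def prune_blocks(importance, keep):
    """Return indices of the `keep` highest-scoring blocks, in original order.

    Hypothetical helper: `importance` holds one score per block.
    """
    if not 0 < keep <= len(importance):
        raise ValueError("keep must be between 1 and the number of blocks")
    # Rank block indices from most to least important.
    ranked = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    # Restore the original ordering so the remaining blocks still chain correctly.
    return sorted(ranked[:keep])


# Assumed importance scores for a six-block model (illustrative only).
importance = [0.9, 0.2, 0.7, 0.4, 0.8, 0.1]

# Three variants: full model, medium, and small.
variants = {keep: prune_blocks(importance, keep) for keep in (6, 4, 2)}
```

Smaller `keep` values give faster but presumably less accurate variants, which is the kind of deployment choice the pruning contribution is meant to support.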
Xinyu Zhou
College of Computer Science and Artificial Intelligence, Fudan University
Tongxin Pan
College of Computer Science and Artificial Intelligence, Fudan University
Lingyi Hong
Fudan University
Computer Vision
Pinxue Guo
Fudan University
Multimodal LLM, Video Understanding, Tracking and Segmentation
Haijing Guo
College of Computer Science and Artificial Intelligence, Fudan University
Zhaoyu Chen
TikTok
AI Security, Trustworthy AI, Multimodal AI, Generative AI
Kaixun Jiang
Fudan University
Computer Vision, Adversarial Examples
Wenqiang Zhang
College of Computer Science and Artificial Intelligence, Fudan University; College of Intelligent Robotics and Advanced Manufacturing, Fudan University