AI Summary
This work addresses the limited performance of lightweight UAV tracking models in complex scenarios, which stems from the weakened feature representation caused by backbone simplification. To overcome this, the authors propose EATrack, a framework built around a teacher-guided dual-branch distillation mechanism that combines spatially focused feature-level distillation with prediction-level distillation. EATrack further incorporates a fine-grained target-aware distillation strategy and a temporal adaptation module applied at inference to improve the student model's robustness to appearance variations and over time. Experiments on five UAV tracking benchmarks show that EATrack achieves a favorable balance between accuracy and speed.
Abstract
Given the real-time demands of UAV tracking, many methods simplify the backbone to reduce computation, but this often weakens feature representation and degrades performance in complex scenarios. To alleviate this issue, we propose EATrack, an efficient and asymmetric UAV tracking framework centered around a teacher-guided dual-branch distillation strategy that enhances the feature expressiveness of the lightweight student model. Specifically, EATrack investigates two complementary perspectives of knowledge transfer: spatially focused feature-level distillation that compensates for weakened representations by guiding the student to learn strong target representations, and prediction-level distillation that enhances spatial localization by learning the teacher's capability for accurate target localization. Furthermore, to enhance robustness against appearance variations, we introduce a fine-grained target-aware distillation strategy that selectively transfers the teacher's target modeling capacity to the student. A temporal adaptation module is incorporated at inference to enhance robustness over time. Experiments on five UAV benchmarks demonstrate that EATrack achieves a favorable balance between accuracy and speed. Code: https://github.com/GXNU-ZhongLab/EATrack
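To make the dual-branch idea concrete, the following is a minimal, framework-free sketch of how a feature-level term and a prediction-level term could be combined into one distillation objective. It is not EATrack's implementation: the abstract does not specify the loss functions, so the masked mean squared error, the temperature-softened KL divergence, the weights `alpha` and `beta`, and all function names here are assumptions chosen for illustration. Feature maps are simplified to single-channel 2D grids, and the "spatial focus" is modeled as a binary mask over the target region.

```python
import math


def masked_feature_loss(student, teacher, mask):
    """Mean squared error between two 2D feature grids, counted only where mask is 1.

    Assumption: the spatial focus is a binary target-region mask.
    """
    total, count = 0.0, 0
    for s_row, t_row, m_row in zip(student, teacher, mask):
        for s, t, m in zip(s_row, t_row, m_row):
            if m:
                total += (s - t) ** 2
                count += 1
    return total / count if count else 0.0


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a flat list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    norm = sum(exps)
    return [e / norm for e in exps]


def prediction_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) over flattened score-map logits.

    Assumption: prediction-level transfer is modeled as matching softened
    localization score distributions.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(s_feat, t_feat, mask, s_logits, t_logits,
                      alpha=1.0, beta=1.0):
    """Weighted sum of the two branches; alpha and beta are hypothetical weights."""
    feat_term = masked_feature_loss(s_feat, t_feat, mask)
    pred_term = prediction_loss(s_logits, t_logits)
    return alpha * feat_term + beta * pred_term


if __name__ == "__main__":
    teacher_feat = [[1.0, 0.0], [3.0, 0.0]]
    student_feat = [[1.0, 2.0], [3.0, 4.0]]
    target_mask = [[0, 1], [0, 0]]  # only the top-right cell is inside the target
    teacher_scores = [0.1, 2.0, 0.3, 0.2]
    student_scores = [0.5, 1.0, 0.4, 0.6]
    print(distillation_loss(student_feat, teacher_feat, target_mask,
                            student_scores, teacher_scores))
```

In this sketch, errors outside the mask do not contribute to the feature term, so the student is penalized only for mismatches in the target region, while the prediction term pulls the student's score distribution toward the teacher's.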