AI Summary
This work proposes a fast autoregressive visual tracking framework for deploying high-performance trackers on resource-constrained devices, where existing methods are too slow at inference. The approach combines task-specific self-distillation with inter-frame autoregressive sparsification: the former compresses the model without manually designed teacher-student layer pairs, and the latter learns a temporally global token-selection strategy without adding runtime overhead. On the GOT-10k benchmark, the method reaches an average overlap (AO) of 70.6% in real time, and its fastest model runs at 343 FPS on GPU and 121 FPS on CPU. These results indicate a favorable trade-off between accuracy and efficiency for real-time visual tracking.
Abstract
Inference speed and tracking performance are two critical evaluation metrics in visual tracking. However, high-performance trackers often suffer from slow processing speeds, making them impractical for deployment on resource-constrained devices. To alleviate this issue, we propose FARTrack, a Fast Auto-Regressive Tracking framework. Since autoregression emphasizes the temporal nature of the trajectory sequence, it can maintain high performance while achieving efficient execution across various devices. FARTrack introduces Task-Specific Self-Distillation and Inter-frame Autoregressive Sparsification, designed from the perspectives of shallow-yet-accurate distillation and redundant-to-essential token optimization, respectively. Task-Specific Self-Distillation achieves model compression by distilling task-specific tokens layer by layer, enhancing the model's inference speed while avoiding suboptimal manual assignment of teacher-student layer pairs. Meanwhile, Inter-frame Autoregressive Sparsification sequentially condenses multiple templates, avoiding additional runtime overhead while learning a temporally global optimal sparsification strategy. FARTrack demonstrates outstanding speed and competitive performance. It delivers an AO of 70.6% on GOT-10k in real time. In addition, our fastest model achieves a speed of 343 FPS on the GPU and 121 FPS on the CPU.
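To make the idea of condensing multiple templates more concrete, the sketch below keeps only the highest-scoring tokens of each template, processing templates in frame order. It is an illustration under stated assumptions, not FARTrack's implementation: the abstract says the sparsification strategy is learned, whereas here the per-token scores are simply given as inputs, and the names `keep_top_k`, `condense_templates`, and the fixed budget `k` are hypothetical.

```python
from typing import List, Sequence, Tuple

Token = Tuple[float, ...]  # a token embedding, kept as a plain tuple for illustration


def keep_top_k(tokens: Sequence[Token], scores: Sequence[float], k: int) -> List[Token]:
    """Keep the k highest-scoring tokens, preserving their original order."""
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    k = max(0, min(k, len(tokens)))
    # Indices of the k largest scores; ties resolve to the earlier token.
    top = sorted(range(len(tokens)), key=lambda i: (-scores[i], i))[:k]
    return [tokens[i] for i in sorted(top)]


def condense_templates(
    templates: Sequence[Sequence[Token]],
    scores: Sequence[Sequence[float]],
    k: int,
) -> List[Token]:
    """Sparsify each template in frame order and concatenate the surviving tokens.

    Assumption: scores are supplied by the caller. In a learned setting they
    would come from the model rather than being fixed in advance.
    """
    kept: List[Token] = []
    for frame_tokens, frame_scores in zip(templates, scores):
        kept.extend(keep_top_k(frame_tokens, frame_scores, k))
    return kept


# Two templates of three one-dimensional tokens each, with made-up scores.
templates = [[(0.1,), (0.2,), (0.3,)], [(0.4,), (0.5,), (0.6,)]]
scores = [[0.9, 0.1, 0.5], [0.2, 0.8, 0.7]]
condensed = condense_templates(templates, scores, k=2)
print(condensed)  # [(0.1,), (0.3,), (0.5,), (0.6,)]
```

With a budget of two tokens per template, the six input tokens shrink to four, so any later attention over the templates operates on a shorter sequence.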