Detector-Augmented SAMURAI for Long-Duration Drone Tracking

📅 2026-01-08
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses a weakness of detector-based, RGB-only drone tracking in long-term urban surveillance: frequent detection dropouts cause temporal inconsistency and tracking breakdown. It presents the first systematic evaluation of SAMURAI, a foundation model, for drone tracking and proposes a detector-augmented extension that reduces sensitivity to bounding-box initialization and sequence length. Incorporating detector cues consistently improves on SAMURAI's zero-shot performance, with the largest benefits in long sequences and when the target exits and re-enters the scene. Across datasets, the success rate improves by up to 0.393 and the false negative rate (FNR) falls by up to 0.475.

📝 Abstract
Robust long-term tracking of drones is a critical requirement for modern surveillance systems, given their increasing threat potential. While detector-based approaches typically achieve strong frame-level accuracy, they often suffer from temporal inconsistencies caused by frequent detection dropouts. Despite its practical relevance, research on RGB-based drone tracking is still limited and largely reliant on conventional motion models. Meanwhile, foundation models like SAMURAI have established their effectiveness across other domains, exhibiting strong category-agnostic tracking performance. However, their applicability in drone-specific scenarios has not been investigated yet. Motivated by this gap, we present the first systematic evaluation of SAMURAI's potential for robust drone tracking in urban surveillance settings. Furthermore, we introduce a detector-augmented extension of SAMURAI to mitigate sensitivity to bounding-box initialization and sequence length. Our findings demonstrate that the proposed extension significantly improves robustness in complex urban environments, with pronounced benefits in long-duration sequences, especially under drone exit and re-entry events. The incorporation of detector cues yields consistent gains over SAMURAI's zero-shot performance across datasets and metrics, with success rate improvements of up to 0.393 and FNR reductions of up to 0.475.
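The abstract reports gains in success rate and false negative rate (FNR) but does not define either metric on this page. The following is a minimal sketch of one common way such per-sequence metrics could be computed, not the authors' evaluation code. The function names, the `(x1, y1, x2, y2)` box format, the 0.5 IoU threshold, and the convention that `None` marks an absent target or a missing prediction are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def success_rate(preds: Sequence[Optional[Box]],
                 gts: Sequence[Optional[Box]],
                 thr: float = 0.5) -> float:
    """Fraction of target-visible frames with IoU >= thr (assumed definition)."""
    visible = [(p, g) for p, g in zip(preds, gts) if g is not None]
    if not visible:
        return 0.0
    hits = sum(1 for p, g in visible if p is not None and iou(p, g) >= thr)
    return hits / len(visible)


def false_negative_rate(preds: Sequence[Optional[Box]],
                        gts: Sequence[Optional[Box]]) -> float:
    """Fraction of target-visible frames with no prediction (assumed definition)."""
    visible = [(p, g) for p, g in zip(preds, gts) if g is not None]
    if not visible:
        return 0.0
    misses = sum(1 for p, _ in visible if p is None)
    return misses / len(visible)


# Toy sequence: the drone is absent in frame 2 (ground truth is None).
gts = [(0, 0, 10, 10), (0, 0, 10, 10), None, (0, 0, 10, 10)]
preds = [(0, 0, 10, 10), None, None, (5, 5, 15, 15)]
sr = success_rate(preds, gts)
fnr = false_negative_rate(preds, gts)
```

Under these assumed definitions, a detection dropout on a visible frame raises the FNR and lowers the success rate at the same time, which is why the two metrics tend to move together in long sequences with exit and re-entry events.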
Problem

Research questions and friction points this paper is trying to address.

drone tracking
long-term tracking
temporal inconsistency
urban surveillance
bounding-box initialization
Innovation

Methods, ideas, or system contributions that make the work stand out.

detector-augmented tracking
SAMURAI
long-duration drone tracking
urban surveillance
foundation model
Tamara R. Lenhard
Institute for the Protection of Terrestrial Infrastructures, German Aerospace Center (DLR), Sankt Augustin, Germany
Andreas Weinmann
Hochschule Darmstadt
Computer Vision, Imaging, Data Analysis
Hichem Snoussi
Professor of Signal and Image Processing, University of Technology of Troyes
Video Processing, Wireless Sensor Networks, Artificial Intelligence, Statistical Signal Processing
Tobias Koch
Institute for the Protection of Terrestrial Infrastructures, German Aerospace Center (DLR), Sankt Augustin, Germany