๐ค AI Summary
The proliferation of small unmanned aerial vehicles (UAVs) poses escalating public safety risks, yet existing detection systems suffer from large physical footprints, high costs, and limited accuracy in trajectory estimation and UAV-type classification. To address these challenges, this paper proposes a lightweight, audio-driven counter-UAV detection framework. Methodologically, it introduces a novel parallel selective state-space model (Mamba) architecture that fuses timeโfrequency acoustic features; incorporates a residual cross-attention mechanism to enhance temporal modeling; and jointly performs 3D acoustic source localization and fine-grained UAV-type classification. Evaluated on the MMUAD benchmark, the method achieves state-of-the-art performance: trajectory estimation error is reduced by 21.3%, and UAV-type classification accuracy reaches 94.7%. The source code and trained models are publicly released.
๐ Abstract
The increasing prevalence of compact UAVs has introduced significant risks to public safety, while traditional drone detection systems are often bulky and costly. To address these challenges, we present TAME, the Temporal Audio-based Mamba for Enhanced Drone Trajectory Estimation and Classification. This innovative anti-UAV detection model leverages a parallel selective state-space model to simultaneously capture and learn both the temporal and spectral features of audio, effectively analyzing propagation of sound. To further enhance temporal features, we introduce a Temporal Feature Enhancement Module, which integrates spectral features into temporal data using residual cross-attention. This enhanced temporal information is then employed for precise 3D trajectory estimation and classification. Our model sets a new standard of performance on the MMUAD benchmarks, demonstrating superior accuracy and effectiveness. The code and trained models are publicly available on GitHub url{https://github.com/AmazingDay1/TAME}.