🤖 AI Summary
To address inefficiency and high computational overhead in small-object detection—caused by redundant feature utilization and rigid query assignment—this paper proposes Dome-DETR, a density-guided feature-query coordination mechanism. Specifically, the authors design a Density-Focal Extractor (DeFE) for foreground-sensitive feature enhancement; introduce Masked Window Attention Sparsification (MWAS) to reduce Transformer complexity; and propose Progressive Adaptive Query Initialization (PAQI), which dynamically generates high-quality queries based on spatial target density. Integrated into a lightweight DETR architecture, the approach jointly leverages foreground clustering masks, sparse attention, and spatially adaptive density modulation. Experiments on AI-TOD-V2 and VisDrone demonstrate AP improvements of +3.3 and +2.5, respectively, achieving state-of-the-art accuracy while maintaining low FLOPs and a compact model size (<30M parameters).
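The query-allocation idea behind PAQI can be illustrated with a minimal sketch: spend the fixed query budget in proportion to a predicted object-density map, so crowded regions receive more decoder queries than empty ones. The function name, the cell-grid formulation, and the largest-remainder rounding are illustrative assumptions, not the paper's actual progressive, learned scheme.

```python
import numpy as np

def density_adaptive_queries(density, total_queries=300):
    """Hypothetical sketch of density-proportional query allocation.

    density: (H, W) non-negative density scores, one per spatial cell.
    Returns an (H, W) integer map of per-cell query counts that sums
    exactly to total_queries (largest-remainder rounding).
    """
    d = density / density.sum()
    raw = d * total_queries
    counts = np.floor(raw).astype(int)
    # hand out the leftover queries to the cells with the largest
    # fractional parts so the budget is spent exactly
    remainder = total_queries - counts.sum()
    frac = (raw - counts).ravel()
    flat = counts.ravel()  # view into counts
    for idx in np.argsort(frac)[::-1][:remainder]:
        flat[idx] += 1
    return counts
```

With `density = [[4, 1], [1, 2]]` and a budget of 8, the densest cell receives half the queries; a uniform density map degenerates to the usual uniform query initialization.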
📝 Abstract
Tiny object detection plays a vital role in drone surveillance, remote sensing, and autonomous systems, enabling the identification of small targets across vast landscapes. However, existing methods suffer from inefficient feature utilization and high computational costs due to redundant feature processing and rigid query allocation. To address these challenges, we propose Dome-DETR, a novel framework with Density-Oriented Feature-Query Manipulation for Efficient Tiny Object Detection. To reduce feature redundancy, we introduce a lightweight Density-Focal Extractor (DeFE) that produces clustered, compact foreground masks. Leveraging these masks, we incorporate Masked Window Attention Sparsification (MWAS) to focus computational resources on the most informative regions via sparse attention. In addition, we propose Progressive Adaptive Query Initialization (PAQI), which adaptively modulates query density across spatial areas for better query allocation. Extensive experiments demonstrate that Dome-DETR achieves state-of-the-art performance (+3.3 AP on AI-TOD-V2 and +2.5 AP on VisDrone) while maintaining low computational complexity and a compact model size. Code will be released upon acceptance.
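The sparsification idea behind MWAS can be sketched as follows: partition the feature map into non-overlapping windows and run self-attention only inside windows that the foreground mask marks as containing objects, passing background windows through unchanged. This is a minimal NumPy sketch under assumed names and shapes — single-head attention, a hard skip rule, and no learned projections — not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def masked_window_attention(feat, fg_mask, win=4):
    """Sketch of mask-guided sparse window attention.

    feat:    (H, W, C) feature map
    fg_mask: (H, W) boolean foreground mask (e.g. from a density extractor)
    Attention is computed only inside windows that overlap foreground;
    pure-background windows are copied through, saving their O(win^4)
    attention cost. Returns the output map and the number of windows
    actually attended.
    """
    H, W, C = feat.shape
    out = feat.copy()
    n_attended = 0
    for i in range(0, H, win):
        for j in range(0, W, win):
            m = fg_mask[i:i + win, j:j + win]
            if not m.any():          # skip pure-background windows
                continue
            n_attended += 1
            x = feat[i:i + win, j:j + win].reshape(-1, C)  # window tokens
            attn = softmax(x @ x.T / np.sqrt(C))           # scaled dot-product
            out[i:i + win, j:j + win] = (attn @ x).reshape(
                m.shape[0], m.shape[1], C)
    return out, n_attended
```

For tiny-object scenes, where foreground typically covers a small fraction of the image, most windows take the skip branch, which is what makes the attention cost scale with foreground area rather than image area.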