🤖 AI Summary
Transformer-based detectors are sensitive to noise introduced by feature pyramid networks (FPN), and existing label assignment strategies yield too few high-quality positive queries for small objects. To address both problems, this paper proposes a Noise-Resilient Query Optimization (NRQO) framework with two components: (1) NT-FPN, a noise-tolerant feature pyramid network that preserves spatial and semantic information during feature fusion; and (2) PS-RPN, a region proposal network that matches anchors to ground-truth boxes by pairwise position and shape similarity, without additional hyperparameters, increasing both the number and the quality of positive queries. Together, the two modules change how queries are generated and how labels are assigned in DETR-style detectors. Experiments on the COCO and VisDrone benchmarks show consistent improvements over state-of-the-art methods for small-object detection.
📝 Abstract
Despite advances in Transformer-based detectors for small object detection (SOD), recent studies show that these detectors still struggle with two problems: the inherent noise sensitivity of feature pyramid networks (FPN) and the diminished query quality produced by existing label assignment strategies. In this paper, we propose a novel Noise-Resilient Query Optimization (NRQO) paradigm, which combines the Noise-Tolerance Feature Pyramid Network (NT-FPN) and the Pairwise-Similarity Region Proposal Network (PS-RPN). Specifically, NT-FPN mitigates noise during feature fusion in FPN by preserving the integrity of spatial and semantic information. Unlike existing label assignment strategies, PS-RPN generates a sufficient number of high-quality positive queries by improving anchor-to-ground-truth matching through position and shape similarities, without the need for additional hyperparameters. Extensive experiments on multiple benchmarks consistently demonstrate the superiority of NRQO over state-of-the-art baselines.
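To make the idea of matching anchors to ground-truth boxes by position and shape similarity more concrete, the following is a minimal illustrative sketch. It is not the formulation used in the paper, which the abstract does not give. The function names, the exponential center-distance term, the side-ratio shape term, and the mean-plus-standard-deviation threshold are all assumptions chosen only to show how such a matching could avoid hand-tuned thresholds.

```python
import math

def pairwise_similarity(anchor, gt):
    """Hypothetical position-and-shape score for boxes given as (cx, cy, w, h)."""
    ax, ay, aw, ah = anchor
    gx, gy, gw, gh = gt
    # Position term (assumed form): center distance scaled by the GT diagonal.
    position = math.exp(-math.hypot(ax - gx, ay - gy) / math.hypot(gw, gh))
    # Shape term (assumed form): smaller-to-larger side ratio per dimension.
    shape = (min(aw, gw) / max(aw, gw)) * (min(ah, gh) / max(ah, gh))
    return position * shape

def assign_positives(anchors, gts):
    """For each ground-truth box, return indices of anchors treated as positive.

    The threshold is derived from the score statistics of each box
    (mean + population std), so no fixed cut-off is supplied by the caller.
    This adaptive rule is an assumption, not the paper's rule.
    """
    result = []
    for gt in gts:
        scores = [pairwise_similarity(a, gt) for a in anchors]
        mean = sum(scores) / len(scores)
        std = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
        picked = [i for i, s in enumerate(scores) if s >= mean + std]
        # Fall back to the single best anchor if nothing clears the threshold.
        if not picked:
            picked = [max(range(len(scores)), key=scores.__getitem__)]
        result.append(picked)
    return result

anchors = [(10, 10, 4, 4), (10, 10, 8, 8), (50, 50, 4, 4), (11, 10, 4, 5)]
gts = [(10, 10, 4, 4)]
positives = assign_positives(anchors, gts)
print(positives)  # [[0]]
```

In this toy example only the anchor that coincides with the ground-truth box is selected. An anchor with the same center but twice the side length scores 0.25, which shows how a shape term can penalize size mismatch that a purely position-based criterion would miss.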