🤖 AI Summary
Small-object detection in remote sensing imagery faces challenges including high inter-class similarity, extreme foreground-background imbalance, and the difficulty of keeping models lightweight. To address these, we propose RFWNet, a lightweight detector built on RFASNet, a backbone network with multi-scale receptive field adaptive selection, together with a Foreground Background Separation Module (FBSM) that filters redundant background information while enhancing foreground features. Additionally, we introduce a composite loss function, Weighted CIoU-Wasserstein (WCW), which weights the IoU-based loss by the Normalized Wasserstein Distance to reduce sensitivity to small-object position deviations. Evaluated on DOTA V1.0 and NWPU VHR-10, our method achieves advanced performance with only 6.0M parameters and an inference speed of 52 FPS.
📝 Abstract
Challenges in remote sensing object detection (RSOD), such as high inter-class similarity, imbalanced foreground-background distribution, and the small size of objects in remote sensing images, significantly hinder detection accuracy. Moreover, the trade-off between model accuracy and computational complexity places additional constraints on the application of RSOD algorithms. To address these issues, this study proposes RFWNet, an efficient and lightweight RSOD algorithm that integrates multi-scale receptive fields and a foreground focus mechanism. Specifically, we propose a lightweight backbone network, the Receptive Field Adaptive Selection Network (RFASNet), which leverages the rich context information of remote sensing images to enhance class separability. Additionally, we develop a Foreground Background Separation Module (FBSM), consisting of a background redundant-information filtering module and a foreground information enhancement module, to emphasize critical regions within images while filtering redundant background information. Finally, we design a loss function, the Weighted CIoU-Wasserstein (WCW) loss, which weights the IoU-based loss using the Normalized Wasserstein Distance to mitigate the model's sensitivity to small-object position deviations. Experimental evaluations on the DOTA V1.0 and NWPU VHR-10 datasets demonstrate that RFWNet achieves advanced performance with 6.0M parameters and runs at 52 FPS.
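To illustrate why a Wasserstein-based term helps with small objects, the following is a minimal sketch, not the paper's implementation. It computes CIoU and the Normalized Wasserstein Distance (NWD) for boxes in `(cx, cy, w, h)` form and combines them in a simple convex blend. The abstract does not give the exact WCW weighting, so `blended_loss`, its `ratio` parameter, and the constant `c=12.8` are assumptions made only for illustration.

```python
import math

def to_corners(box):
    """Convert (cx, cy, w, h) to (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2

def iou(a, b, eps=1e-9):
    """Plain intersection-over-union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = to_corners(a)
    bx1, by1, bx2, by2 = to_corners(b)
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / (union + eps)

def ciou(pred, gt, eps=1e-9):
    """Complete IoU: IoU minus center-distance and aspect-ratio penalties."""
    i = iou(pred, gt, eps)
    px1, py1, px2, py2 = to_corners(pred)
    gx1, gy1, gx2, gy2 = to_corners(gt)
    # Squared diagonal of the smallest box enclosing both boxes.
    diag2 = (max(px2, gx2) - min(px1, gx1)) ** 2 \
          + (max(py2, gy2) - min(py1, gy1)) ** 2 + eps
    # Squared distance between box centers.
    rho2 = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2
    # Aspect-ratio consistency term.
    v = (4 / math.pi ** 2) * (math.atan(gt[2] / gt[3])
                              - math.atan(pred[2] / pred[3])) ** 2
    alpha = v / (1 - i + v + eps)
    return i - rho2 / diag2 - alpha * v

def nwd(a, b, c=12.8):
    """Normalized Wasserstein Distance between boxes modeled as 2D Gaussians.

    c is a dataset-dependent normalizing constant; 12.8 is a placeholder.
    """
    w2 = math.sqrt((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
                   + ((a[2] - b[2]) / 2) ** 2 + ((a[3] - b[3]) / 2) ** 2)
    return math.exp(-w2 / c)

def blended_loss(pred, gt, ratio=0.5, c=12.8):
    """Hypothetical blend of CIoU loss and NWD loss (not the paper's WCW)."""
    return ratio * (1 - ciou(pred, gt)) + (1 - ratio) * (1 - nwd(pred, gt, c))

# A 4-pixel shift removes all overlap for a 4x4 object but barely affects a 40x40 one.
small_gt, small_pred = (10, 10, 4, 4), (14, 10, 4, 4)
big_gt, big_pred = (100, 100, 40, 40), (104, 100, 40, 40)
print(iou(small_pred, small_gt), nwd(small_pred, small_gt))
print(iou(big_pred, big_gt), nwd(big_pred, big_gt))
```

For the shifted 4x4 box the IoU drops to zero, so an IoU-only loss cannot tell how far off the prediction is, whereas NWD still decays smoothly with the offset. That behavior is the reason for bringing a Wasserstein term into the localization loss for small objects.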