Small Object Detection in Complex Backgrounds with Multi-Scale Attention and Global Relation Modeling

2026-03-04
Citations: 0
Influential: 0
AI Summary
This work addresses the challenges of small object detection in complex backgrounds, where performance is hindered by feature degradation, weak semantic cues, and inaccurate localization. To overcome these limitations, we propose a multi-level feature enhancement framework with global relation modeling. Specifically, residual Haar wavelet downsampling is employed to preserve structural details, while a global relation module suppresses background interference. A cross-scale hybrid attention mechanism enables efficient multi-scale feature fusion, and a center-assisted loss function is introduced to refine localization accuracy. Evaluated on the RGBT-Tiny benchmark, the proposed method significantly outperforms existing state-of-the-art approaches, achieving superior performance in both IoU-based metrics and scale-adaptive evaluation criteria.

Abstract
Small object detection under complex backgrounds remains a challenging task due to severe feature degradation, weak semantic representation, and inaccurate localization caused by downsampling operations and background interference. Existing detection frameworks are mainly designed for general objects and often fail to explicitly address the unique characteristics of small objects, such as limited structural cues and strong sensitivity to localization errors. In this paper, we propose a multi-level feature enhancement and global relation modeling framework tailored for small object detection. Specifically, a Residual Haar Wavelet Downsampling module is introduced to preserve fine-grained structural details by jointly exploiting spatial-domain convolutional features and frequency-domain representations. To enhance global semantic awareness and suppress background noise, a Global Relation Modeling module is employed to capture long-range dependencies at high-level feature stages. Furthermore, a Cross-Scale Hybrid Attention module is designed to establish sparse and aligned interactions across multi-scale features, enabling effective fusion of high-resolution details and high-level semantic information with reduced computational overhead. Finally, a Center-Assisted Loss is incorporated to stabilize training and improve localization accuracy for small objects. Extensive experiments conducted on the large-scale RGBT-Tiny benchmark demonstrate that the proposed method consistently outperforms existing state-of-the-art detectors under both IoU-based and scale-adaptive evaluation metrics. These results validate the effectiveness and robustness of the proposed framework for small object detection in complex environments.
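To illustrate why a wavelet-based downsampling step can keep detail that plain strided subsampling drops, here is a minimal sketch of a single-level 2D Haar transform in pure Python. It is not the paper's Residual Haar Wavelet Downsampling module: the convolutional branch and the residual fusion are omitted, and the function name `haar_downsample` and the sub-band names are assumptions made for this example only.

```python
def haar_downsample(x):
    """Single-level orthonormal 2D Haar transform of an H x W grid.

    H and W must be even. Returns four (H/2) x (W/2) sub-bands:
    a low-pass band (scaled local sum) and three detail bands holding
    column, row and diagonal differences of each 2x2 block.
    Hypothetical helper for illustration, not the authors' code.
    """
    h, w = len(x), len(x[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    low, col_diff, row_diff, diag = [], [], [], []
    for i in range(0, h, 2):
        r_low, r_col, r_row, r_diag = [], [], [], []
        for j in range(0, w, 2):
            a, b = x[i][j], x[i][j + 1]          # top row of the 2x2 block
            c, d = x[i + 1][j], x[i + 1][j + 1]  # bottom row of the block
            r_low.append((a + b + c + d) / 2)
            r_col.append((a - b + c - d) / 2)    # left column minus right column
            r_row.append((a + b - c - d) / 2)    # top row minus bottom row
            r_diag.append((a - b - c + d) / 2)   # diagonal minus anti-diagonal
        low.append(r_low)
        col_diff.append(r_col)
        row_diff.append(r_row)
        diag.append(r_diag)
    return low, col_diff, row_diff, diag


# A 4x4 image containing a one-pixel "object" at row 1, column 1.
image = [
    [0, 0, 0, 0],
    [0, 4, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

# Stride-2 subsampling keeps only even rows and columns, so the object vanishes.
strided = [row[::2] for row in image[::2]]

# The Haar sub-bands are also half resolution, but the object stays visible
# in the top-left coefficient of every band.
low, col_diff, row_diff, diag = haar_downsample(image)
```

In this toy case `strided` is all zeros, while each Haar sub-band has a non-zero top-left coefficient. Stacking the four sub-bands as channels gives a half-resolution representation from which the input can be reconstructed exactly, which is the property that makes this kind of downsampling attractive for objects only a few pixels wide.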
Problem

Research questions and friction points this paper is trying to address.

Small Object Detection
Complex Backgrounds
Feature Degradation
Localization Accuracy
Semantic Representation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Scale Attention
Global Relation Modeling
Haar Wavelet Downsampling
Small Object Detection
Cross-Scale Hybrid Attention
Wenguang Tao
Unmanned System Research Institute, Northwestern Polytechnical University, Xi'an, China
Xiaotian Wang
Unmanned System Research Institute, Northwestern Polytechnical University, Xi'an, China
Tian Yan
Unmanned System Research Institute, Northwestern Polytechnical University, Xi'an, China
Yi Wang
The Hong Kong Polytechnic University
Biomaterials
Jie Yan
jieyan@amss.ac.cn
deep generative models, clustering