Efficient Feature Fusion for UAV Object Detection

📅 2025-01-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Small-object detection in UAV remote sensing imagery faces significant challenges—including low-quality imaging, complex backgrounds, and severe occlusion—exacerbated by insufficient multi-scale feature fusion in existing methods, which compromises both localization accuracy and classification robustness. To address this, we propose an efficient multi-scale feature fusion framework featuring a novel hybrid-resolution adaptive fusion mechanism. This mechanism integrates learnable hybrid up/down-sampling, cross-layer skip connections, and multi-scale feature reweighting to enable cross-layer feature alignment and collaborative optimization across arbitrary resolutions. Embedded into the YOLOv10 architecture without increasing model parameters, our method achieves a 2% improvement in small-object AP on two public UAV benchmarks, significantly enhancing detection recall and localization precision while preserving computational efficiency and keeping the model lightweight.

📝 Abstract
Object detection in unmanned aerial vehicle (UAV) remote sensing images poses significant challenges due to unstable image quality, small object sizes, complex backgrounds, and environmental occlusions. Small objects, in particular, occupy minimal portions of images, making their accurate detection highly difficult. Existing multi-scale feature fusion methods address these challenges to some extent by aggregating features across different resolutions. However, these methods often fail to effectively balance classification and localization performance for small objects, primarily due to insufficient feature representation and imbalanced network information flow. In this paper, we propose a novel feature fusion framework specifically designed for UAV object detection tasks to enhance both localization accuracy and classification performance. The proposed framework integrates hybrid upsampling and downsampling modules, enabling feature maps from different network depths to be flexibly adjusted to arbitrary resolutions. This design facilitates cross-layer connections and multi-scale feature fusion, ensuring improved representation of small objects. Our approach leverages hybrid downsampling to enhance fine-grained feature representation, improving spatial localization of small targets, even under complex conditions. Simultaneously, the upsampling module aggregates global contextual information, optimizing feature consistency across scales and enhancing classification robustness in cluttered scenes. Experimental results on two public UAV datasets demonstrate the effectiveness of the proposed framework. Integrated into the YOLO-V10 model, our method achieves a 2% improvement in average precision (AP) compared to the baseline YOLO-V10 model, while maintaining the same number of parameters. These results highlight the potential of our framework for accurate and efficient UAV object detection.
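The hybrid up/down-sampling and reweighted fusion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-neighbor upsampling, average-pool downsampling, and fixed fusion weights below are simple stand-ins for the paper's learnable hybrid sampling modules and multi-scale reweighting, used only to show how feature maps from different depths are brought to a common resolution and combined.

```python
def upsample_nearest(fmap, factor):
    """Nearest-neighbor upsampling of a 2D feature map (list of lists).

    Stand-in for the paper's learnable hybrid upsampling module.
    """
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out

def downsample_avg(fmap, factor):
    """Average-pool downsampling with stride = factor.

    Stand-in for the paper's learnable hybrid downsampling module.
    """
    h, w = len(fmap), len(fmap[0])
    out = []
    for i in range(0, h, factor):
        row = []
        for j in range(0, w, factor):
            block = [fmap[i + di][j + dj]
                     for di in range(factor) for dj in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

def fuse(maps, weights):
    """Reweighted elementwise sum of feature maps at a common resolution."""
    total = sum(weights)
    norm = [w / total for w in weights]  # normalize weights to sum to 1
    h, w = len(maps[0]), len(maps[0][0])
    return [[sum(nw * m[i][j] for nw, m in zip(norm, maps))
             for j in range(w)]
            for i in range(h)]

# Toy two-level pyramid: a 4x4 "shallow" map and a 2x2 "deep" map
shallow = [[1, 2, 3, 4],
           [5, 6, 7, 8],
           [9, 10, 11, 12],
           [13, 14, 15, 16]]
deep = [[1.0, 2.0],
        [3.0, 4.0]]

# Align the deep map to the shallow resolution, then fuse with
# (hypothetical) weights playing the role of learned reweighting.
deep_up = upsample_nearest(deep, 2)
fused = fuse([shallow, deep_up], weights=[0.7, 0.3])
```

In the actual framework these sampling operators are learnable and paired with cross-layer skip connections, so any layer's features can be adjusted to any target resolution before fusion; the sketch only captures the align-then-reweight structure.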
Problem

Research questions and friction points this paper is trying to address.

Drone Imaging
Small Object Detection
Feature Representation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Drone Imagery
Multi-scale Feature Fusion
Small Object Detection
Xudong Wang
School of Computer Science, East China Normal University, Shanghai, China
Chaomin Shen
Dept of Computer Science, East China Normal University
Image Processing · Machine Learning
Yaxin Peng
Department of Mathematics, College of Sciences, Shanghai University, Shanghai, China