🤖 AI Summary
Underwater object detection is hindered by light attenuation, color distortion, cluttered backgrounds, and small-scale targets. This work proposes SPMamba-YOLO, a YOLO-based detector that incorporates a Mamba state space module to model long-range dependencies and global context efficiently, strengthens multi-scale feature aggregation with the SPPELAN module, and adds a Pyramid Split Attention (PSA) mechanism to improve feature discriminability. On the URPC2022 dataset, the approach improves mAP@0.5 by more than 4.9% over the YOLOv8n baseline, with the gains most pronounced on small objects and in dense scenes, while keeping a favorable balance between detection accuracy and computational cost.
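For readers unfamiliar with the metric, mAP@0.5 counts a predicted box as a correct detection when its intersection over union (IoU) with a same-class ground-truth box is at least 0.5. The sketch below is a generic illustration of that matching rule, not code from the paper; the function names `iou` and `is_true_positive` and the `(x1, y1, x2, y2)` box layout are assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x2 >= x1 and y2 >= y1
    (an assumed layout for this illustration).
    """
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes contribute no intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """Matching rule behind mAP@0.5: IoU must reach the threshold."""
    return iou(pred_box, gt_box) >= threshold


# A 10x10 px target: a 2 px shift still matches, a 4 px shift does not.
gt = (0, 0, 10, 10)
print(is_true_positive((2, 0, 12, 10), gt))  # IoU = 80 / 120
print(is_true_positive((4, 0, 14, 10), gt))  # IoU = 60 / 140
```

The small-box case suggests one reason small targets are hard under this metric: a localization error of a few pixels is enough to push IoU below the threshold.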
📝 Abstract
Underwater object detection is a critical yet challenging research problem owing to severe light attenuation, color distortion, background clutter, and the small scale of underwater targets. To address these challenges, we propose SPMamba-YOLO, a novel underwater object detection network that integrates multi-scale feature enhancement with global context modeling. Specifically, a Spatial Pyramid Pooling Enhanced Layer Aggregation Network (SPPELAN) module is introduced to strengthen multi-scale feature aggregation and expand the receptive field, while a Pyramid Split Attention (PSA) mechanism enhances feature discrimination by emphasizing informative regions and suppressing background interference. In addition, a Mamba-based state space modeling module is incorporated to efficiently capture long-range dependencies and global contextual information, thereby improving detection robustness in complex underwater environments. Extensive experiments on the URPC2022 dataset demonstrate that SPMamba-YOLO outperforms the YOLOv8n baseline by more than 4.9% in mAP@0.5, particularly for small and densely distributed underwater objects, while maintaining a favorable balance between detection accuracy and computational cost.
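To give a sense of why state space modeling can capture long-range dependencies at a cost linear in sequence length, the following is a minimal sketch of a discrete linear state space recurrence with a diagonal state matrix. It is not the SPMamba-YOLO module: Mamba additionally makes the parameters input-dependent (selective) and uses a hardware-aware parallel scan, and how the paper turns image features into sequences is not described here. The function name `ssm_scan` and the parameters `a`, `b`, `c` are assumptions chosen for illustration.

```python
def ssm_scan(xs, a, b, c):
    """Run a diagonal linear state space recurrence over a 1-D sequence.

    For each state channel i:
        h_t[i] = a[i] * h_{t-1}[i] + b[i] * x_t
    and the output is
        y_t = sum_i c[i] * h_t[i]

    Fixed a, b, c are a simplification; Mamba computes its
    parameters from the input at every step.
    """
    h = [0.0] * len(a)  # hidden state starts at zero
    ys = []
    for x in xs:
        # One constant-cost update per token, so the whole scan is
        # linear in the sequence length.
        h = [a_i * h_i + b_i * x for a_i, h_i, b_i in zip(a, h, b)]
        ys.append(sum(c_i * h_i for c_i, h_i in zip(c, h)))
    return ys


# An impulse at t = 0 keeps influencing later outputs through the
# state, decaying by a factor of a = 0.5 per step.
print(ssm_scan([1.0, 0.0, 0.0, 0.0], a=[0.5], b=[1.0], c=[1.0]))
# [1.0, 0.5, 0.25, 0.125]
```

Every output depends on all earlier inputs through the hidden state, whereas a convolution only sees a fixed local window; values of `a` close to 1 retain information over longer spans.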