🤖 AI Summary
To address the scarcity of annotated data and the poor adaptability of vessel detection models in complex inland waterway environments, which are characterized by narrow channels, variable weather, and urban interference, this paper introduces MEIWVD, the first high-quality, multi-scenario inland vessel detection dataset. It comprises 32,478 images captured under diverse illumination and weather conditions (clear, rainy, foggy, and artificial lighting). The paper further proposes a scene-guided image enhancement module and a network that combines parameter-constrained dilated convolution with multi-scale dilated residual fusion, designed to strengthen multi-scale feature representation and environmental adaptability under strict lightweight constraints. Evaluated on mainstream detectors including YOLO and FCOS, the method achieves a 6.2% mAP improvement over baseline models and improves robustness in challenging scenarios. This work establishes a new benchmark and provides a practical technical pathway for vision-based intelligent navigation in inland waterways.
📝 Abstract
The success of deep learning in intelligent ship visual perception relies heavily on rich image data. However, dedicated datasets for inland waterway vessels remain scarce, which limits the adaptability of visual perception systems in complex environments. Inland waterways, characterized by narrow channels, variable weather, and urban interference, pose significant challenges to object detection systems trained on existing datasets. To address these issues, this paper introduces the Multi-environment Inland Waterway Vessel Dataset (MEIWVD), comprising 32,478 high-quality images from diverse scenarios, including sunny, rainy, foggy, and artificial lighting conditions. MEIWVD covers common vessel types in the Yangtze River Basin and emphasizes diversity, sample independence, environmental complexity, and multi-scale characteristics, making it a robust benchmark for vessel detection. Building on MEIWVD, this paper proposes a scene-guided image enhancement module that adaptively improves water surface images according to environmental conditions. Additionally, a parameter-limited dilated convolution enhances the representation of vessel features, while a multi-scale dilated residual fusion method integrates multi-scale features for better detection. Experiments show that MEIWVD provides a more rigorous benchmark for object detection algorithms, and that the proposed methods significantly improve detector performance, especially in complex multi-environment scenarios.
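The abstract does not spell out the network design, so the following is only a generic, minimal sketch of the two ideas it names: a dilated convolution, which widens the receptive field without adding weights, and a residual fusion of several dilation rates. The function names, the dilation rates, and the choice to average the branches before adding the input back are all illustrative assumptions, not the authors' implementation.

```python
def dilated_conv2d(image, kernel, dilation=1):
    """Same-size 2D dilated cross-correlation with zero padding.

    Assumes a square, odd-sized kernel given as a list of lists.
    """
    h, w = len(image), len(image[0])
    k = len(kernel)
    half = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    # Kernel taps are spaced `dilation` pixels apart.
                    yy = y + (i - half) * dilation
                    xx = x + (j - half) * dilation
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def receptive_field(kernel_size, dilation):
    """Side length covered by one dilated kernel; the weight count is unchanged."""
    return dilation * (kernel_size - 1) + 1


def multi_scale_residual_fusion(image, kernel, dilations=(1, 2, 3)):
    """Hypothetical fusion: average the dilated branches, then add the input."""
    branches = [dilated_conv2d(image, kernel, d) for d in dilations]
    h, w = len(image), len(image[0])
    return [
        [image[y][x] + sum(b[y][x] for b in branches) / len(branches)
         for x in range(w)]
        for y in range(h)
    ]


if __name__ == "__main__":
    # A single bright pixel shows where each dilation rate "looks".
    img = [[0.0] * 7 for _ in range(7)]
    img[3][3] = 1.0
    ones = [[1.0] * 3 for _ in range(3)]
    for d in (1, 2, 3):
        print("dilation", d, "covers", receptive_field(3, d), "pixels per side")
    fused = multi_scale_residual_fusion(img, ones, dilations=(1, 2))
    print(fused[3][3], fused[2][2], fused[1][1])
```

A 3x3 kernel keeps its nine weights at every dilation rate while the area it covers grows, which is why dilation suits a lightweight detector that must handle both small and large vessels. In a trained network each branch would have its own learned weights; the shared all-ones kernel here only keeps the example small.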