AI Summary
This study addresses the challenge of extreme label scarcity in sonar image object detection, stemming from sparse textures and high noise levels that complicate annotation. To tackle this issue, the authors propose RSOD, a semi-supervised object detection method based on a teacher-student framework. RSOD leverages multi-view consistency to compute reliability scores for pseudo-labels and integrates an object-mixing pseudo-labeling strategy with reliability-guided adaptive constraints, effectively combining limited labeled data with abundant unlabeled data. Evaluated on the UATD dataset, the method achieves performance comparable to a fully supervised baseline trained with 100% labeled data while using only 5% of the annotations. Additionally, the authors introduce a new sonar image dataset to support future research in this domain.
Abstract
Object detection in sonar images is a key technology in underwater detection systems. Compared with natural images, sonar images contain fewer texture details and are more susceptible to noise, which makes it difficult for non-experts to distinguish subtle differences between classes and, in turn, to provide precise annotations. Designing effective object detection methods for sonar images with extremely limited labels is therefore particularly important. To address this, we propose RSOD, a teacher-student framework that aims to fully learn the characteristics of sonar images and uses a pseudo-labeling strategy tailored to them to mitigate the impact of limited labels. First, RSOD computes a reliability score by assessing the consistency of the teacher's predictions across different views. Second, to leverage this score, we introduce an object-mixing pseudo-labeling method that addresses the shortage of labeled sonar data. Finally, we optimize the student's performance with a reliability-guided adaptive constraint. By taking full advantage of unlabeled data, the student performs well even with extremely limited labels. Notably, on the UATD dataset, our method achieves results competitive with those of our baseline algorithm trained on 100% labeled data while using only 5% of the labeled data. We also collected a new dataset to provide more valuable data for sonar research.
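
The abstract does not give the formula behind the reliability score or the adaptive constraint, so the following is only a minimal sketch of the general idea, not RSOD's actual method. It assumes that the teacher's boxes from two views have already been mapped into the same coordinate frame, that agreement is measured by the best IoU match between views, and that the score is used as a per-box weight on the student's unsupervised loss. The names `iou`, `reliability_scores`, and `weighted_unsup_loss` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def reliability_scores(view_a: List[Box], view_b: List[Box]) -> List[float]:
    """Assumed scoring rule: for each teacher box predicted on view A,
    take its best IoU with any teacher box predicted on view B.
    A box the teacher reproduces in both views scores close to 1;
    a box that appears in only one view scores 0."""
    return [max((iou(a, b) for b in view_b), default=0.0) for a in view_a]


def weighted_unsup_loss(losses: List[float], scores: List[float]) -> float:
    """Assumed use of the score: a reliability-weighted mean of the
    per-pseudo-label losses, so unreliable boxes contribute less."""
    total = sum(scores)
    if total == 0:
        return 0.0
    return sum(l * s for l, s in zip(losses, scores)) / total


if __name__ == "__main__":
    teacher_view_a = [(0, 0, 10, 10), (20, 20, 30, 30)]
    teacher_view_b = [(0, 0, 10, 5)]
    scores = reliability_scores(teacher_view_a, teacher_view_b)
    print(scores)
    print(weighted_unsup_loss([2.0, 4.0], scores))
```

In this sketch a pseudo-label that the teacher fails to reproduce in the second view receives zero weight, so it cannot mislead the student; a real system would likely also account for class agreement and confidence.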