Closer to Ground Truth: Realistic Shape and Appearance Labeled Data Generation for Unsupervised Underwater Image Segmentation

📅 2025-03-20
🤖 AI Summary
Low visibility and scarce annotated data hinder fish segmentation in underwater videos. To address this, we propose a two-stage unsupervised segmentation approach that requires no human annotations: virtual fish are placed into real underwater scenes after Thin Plate Spline shape warping and color Histogram Matching, so the generated images move closer to real data with each stage. We also introduce DeepSalmon, a 30 GB salmon segmentation dataset described as the largest of its kind in the literature. The unsupervised method performs close to a fully supervised SoTA model on DeepFish, and on both DeepFish and DeepSalmon it boosts the performance of the fully supervised SoTA model. Key contributions are (1) an annotation-free data synthesis procedure that combines geometric (shape warping) and photometric (color matching) transformations, and (2) the DeepSalmon dataset for salmon segmentation in underwater videos.

📝 Abstract
Fish segmentation in underwater videos is a real-world problem of great practical value to the marine and aquaculture industries. It is challenging because of the difficult filming environment, poor visibility, and the limited amount of annotated underwater fish data. To overcome these obstacles, we introduce a novel two-stage unsupervised segmentation approach that requires no human annotations and combines artificially created and real images. Our method generates challenging synthetic training data by placing virtual fish in real-world underwater habitats. Before placement, the fish undergo transformations such as Thin Plate Spline shape warping and color Histogram Matching, which integrate them realistically into the backgrounds and bring the generated images closer to real-world data with every stage of our approach. We validate our unsupervised method on the popular DeepFish dataset, where it performs close to a fully supervised SoTA model. We further show its effectiveness on the specific case of salmon segmentation in underwater videos, for which we introduce DeepSalmon, the largest dataset of its kind in the literature (30 GB). Moreover, on both datasets we demonstrate that our approach can boost the performance of the fully supervised SoTA model.
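
The Thin Plate Spline shape warping mentioned in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the function names (`fit_tps`, `warp_point`), the control-point layout, and the absence of any smoothing term are assumptions made for illustration. The sketch fits a 2-D thin plate spline that moves a set of source control points onto target positions and then evaluates the mapping at arbitrary coordinates.

```python
import math


def _u(r2):
    """TPS radial kernel U(r) = r^2 log r, written in terms of r^2."""
    return 0.0 if r2 == 0.0 else 0.5 * r2 * math.log(r2)


def _solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_tps(src, dst):
    """Fit a 2-D thin plate spline mapping each src point onto its dst point.

    Requires at least three non-collinear, distinct control points.
    """
    n = len(src)
    size = n + 3
    a = [[0.0] * size for _ in range(size)]
    for i, (xi, yi) in enumerate(src):
        for j, (xj, yj) in enumerate(src):
            a[i][j] = _u((xi - xj) ** 2 + (yi - yj) ** 2)
        a[i][n:] = [1.0, xi, yi]                      # affine part [1, x, y]
        a[n][i], a[n + 1][i], a[n + 2][i] = 1.0, xi, yi
    params = []
    for axis in (0, 1):                               # one spline per output axis
        rhs = [p[axis] for p in dst] + [0.0, 0.0, 0.0]
        params.append(_solve(a, rhs))
    return src, params


def warp_point(model, x, y):
    """Evaluate the fitted spline at (x, y) and return the warped coordinates."""
    src, params = model
    n = len(src)
    out = []
    for p in params:
        val = p[n] + p[n + 1] * x + p[n + 2] * y
        for w, (xi, yi) in zip(p, src):
            val += w * _u((x - xi) ** 2 + (y - yi) ** 2)
        out.append(val)
    return tuple(out)


# Hypothetical control points: four fixed corners and a shifted center.
src_pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
dst_pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.6, 0.5)]
model = fit_tps(src_pts, dst_pts)
```

In an image pipeline, a mapping like this would typically be evaluated for every output pixel and combined with resampling. Applying the same mapping to the fish's binary mask keeps the segmentation label aligned with the warped fish, which is what makes synthetic data of this kind annotation-free.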
Problem

Research questions and friction points this paper is trying to address.

How to generate realistic synthetic data for unsupervised underwater fish segmentation.
How to improve segmentation accuracy without human annotations, using synthetic and real images.
How to validate the method on DeepFish and on salmon segmentation, for which the DeepSalmon dataset is introduced.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage unsupervised segmentation approach
Synthetic data with Thin Plate Spline warping
Histogram Matching for realistic image integration
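
As a rough illustration of Histogram Matching, the sketch below remaps the intensities of one color channel so that their cumulative distribution follows that of a reference channel. It is a minimal, dependency-free example rather than the paper's procedure: the function names, the 256-level assumption, and the idea of matching fish pixels to background pixels channel by channel are assumptions made here for illustration.

```python
from itertools import accumulate


def cdf(values, levels=256):
    """Normalized cumulative histogram of integer intensities in [0, levels)."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    return [c / total for c in accumulate(hist)]


def match_histogram(source, reference, levels=256):
    """Remap `source` intensities so their distribution follows `reference`."""
    src_cdf = cdf(source, levels)
    ref_cdf = cdf(reference, levels)
    lut = []
    j = 0
    for s in range(levels):
        # Smallest reference level whose CDF reaches the source CDF at level s.
        # Both CDFs are non-decreasing, so j never needs to move backwards.
        while j < levels - 1 and ref_cdf[j] < src_cdf[s]:
            j += 1
        lut.append(j)
    return [lut[v] for v in source]


# Hypothetical single-channel pixel values: a dark fish, a brighter background.
fish_pixels = [10, 10, 20, 30]
background_pixels = [100, 150, 200, 250]
matched = match_histogram(fish_pixels, background_pixels)
print(matched)  # [150, 150, 200, 250]
```

For a color image, the same remapping would be applied to each channel separately and restricted to the pixels covered by the fish mask, so that the background itself is left untouched.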