UniV2D: Bridging Visual Restoration and Semantic Perception for Underwater Salient Object Detection

📅 2026-05-07
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This work addresses underwater salient object detection, which suffers from severe visual degradation caused by wavelength-selective absorption and scattering. Conventional two-stage "enhance-then-detect" approaches often introduce semantic inconsistencies between the restoration and detection phases. To overcome this limitation, we propose UniV2D, a unified network that replaces the sequential paradigm with a semantics-driven joint learning framework. The framework enables bidirectional synergy between image restoration and saliency detection: high-level semantic cues guide visual restoration, while the restored output in turn improves saliency perception. The architecture uses a hierarchical dual-branch design comprising a self-calibrated decoder, a mask-aware restoration module, and a saliency-guided refinement module based on cross-level modulation. Extensive experiments show that the method consistently outperforms state-of-the-art approaches across multiple benchmarks in both quantitative metrics and qualitative assessments.
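The wavelength-selective degradation mentioned above is commonly approximated with a simplified image formation model, I_c = J_c * t_c + B_c * (1 - t_c) with t_c = exp(-beta_c * d). The sketch below only illustrates why red fades faster than blue with distance. The coefficients, the background light values, and the function name `degrade_pixel` are illustrative assumptions, not taken from the paper, and UniV2D itself is described as not relying on rigid physical priors.

```python
import math

# Assumed per-channel attenuation coefficients (1/m); illustrative only.
BETA = {"r": 0.60, "g": 0.12, "b": 0.08}
# Assumed veiling (background) light per channel in [0, 1]; illustrative only.
BACKGROUND = {"r": 0.05, "g": 0.45, "b": 0.55}


def degrade_pixel(clean, depth_m):
    """Apply I_c = J_c * t_c + B_c * (1 - t_c), where t_c = exp(-beta_c * depth)."""
    observed = {}
    for channel, j in clean.items():
        t = math.exp(-BETA[channel] * depth_m)  # transmission for this channel
        observed[channel] = j * t + BACKGROUND[channel] * (1.0 - t)
    return observed


white = {"r": 1.0, "g": 1.0, "b": 1.0}
near = degrade_pixel(white, 1.0)   # mild color cast
far = degrade_pixel(white, 10.0)   # red almost fully absorbed
```

Under these assumed values, a white surface keeps most of its blue and green at 10 m while its red channel collapses toward the background light, which is the kind of color cast that restoration has to undo.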
๐Ÿ“ Abstract
Underwater salient object detection (USOD) plays a vital role in marine vision tasks but remains fundamentally challenging due to severe visual degradation, such as selective absorption and medium scattering. Conventional pipelines typically adopt a sequential "enhance-then-detect" paradigm. However, isolating low-level visual restoration from high-level semantic perception often leads to semantic inconsistency, where the restored images may not be optimal for detection and can even introduce task-irrelevant noise. To break this sequential bottleneck, we propose UniV2D, a Unified Vision-to-Detection Network that jointly optimizes visual restoration and salient object detection within a mutually beneficial framework. Unlike traditional methods that rely on disjointed pipelines or rigid physical priors, UniV2D introduces a semantic-driven learning paradigm: high-level saliency semantics actively guide the restoration process, while the restored visual cues reciprocally enhance saliency perception. Specifically, UniV2D features a hierarchical dual-branch architecture. It first employs a self-calibrated decoder to predict initial saliency masks alongside a mask-aware restoration module to reconstruct image content. Subsequently, a saliency-guided refinement module equipped with cross-level modulation is utilized to align structural fidelity with semantic consistency. Extensive experiments across multiple benchmarks demonstrate that UniV2D significantly outperforms state-of-the-art methods in both quantitative and qualitative evaluations, establishing a new standard for joint underwater perception.
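The abstract does not state the training objective, so the following is only a minimal sketch of what jointly optimizing the two tasks could look like: a saliency term plus a restoration term whose per-pixel weight grows with the predicted saliency. The function names, the weighting scheme, and the parameters `lam` and `alpha` are hypothetical and are not taken from the paper.

```python
import math


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(pred)


def mask_weighted_l1(restored, reference, mask, alpha=1.0):
    """Weighted L1 error; a pixel with mask value m gets weight 1 + alpha * m."""
    weights = [1.0 + alpha * m for m in mask]
    err = sum(w * abs(r - t) for w, r, t in zip(weights, restored, reference))
    return err / sum(weights)


def joint_loss(pred_mask, gt_mask, restored, reference, lam=0.5):
    """Hypothetical joint objective: saliency loss + lam * mask-aware restoration loss."""
    saliency_term = bce(pred_mask, gt_mask)
    # Assumption: the predicted mask steers where restoration errors matter most.
    restoration_term = mask_weighted_l1(restored, reference, pred_mask)
    return saliency_term + lam * restoration_term
```

In this sketch a restoration error inside a salient region costs more than the same error in the background, which is one simple way to express the idea that saliency semantics guide restoration.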
Problem

Research questions and friction points this paper is trying to address.

underwater salient object detection
visual degradation
semantic inconsistency
enhance-then-detect paradigm
Innovation

Methods, ideas, or system contributions that make the work stand out.

underwater salient object detection
joint optimization
semantic-driven restoration
unified vision-to-detection
cross-level modulation