UniV2D: Bridging Visual Restoration and Semantic Perception for Underwater Salient Object Detection

📅 2026-05-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses underwater salient object detection, which suffers from severe visual degradation caused by wavelength-selective absorption and scattering. Conventional two-stage “enhance-then-detect” approaches often introduce semantic inconsistencies between the restoration and detection phases. To overcome this limitation, we propose UniV2D, a unified network that replaces the sequential paradigm with a semantics-driven joint learning framework in which image restoration and saliency detection reinforce each other: high-level semantic cues guide visual restoration, and the restored output in turn improves saliency perception. The architecture uses a hierarchical dual-branch design comprising a self-calibrated decoder, a mask-aware restoration module, and a saliency-guided refinement module based on cross-level modulation. Extensive experiments on multiple benchmarks show that the method outperforms state-of-the-art approaches in both quantitative metrics and qualitative assessments.

📝 Abstract

Underwater salient object detection (USOD) plays a vital role in marine vision tasks but remains fundamentally challenging due to severe visual degradation, such as selective absorption and medium scattering. Conventional pipelines typically adopt a sequential "enhance-then-detect" paradigm. However, isolating low-level visual restoration from high-level semantic perception often leads to semantic inconsistency, where the restored images may not be optimal for detection and can even introduce task-irrelevant noise. To break this sequential bottleneck, we propose UniV2D, a Unified Vision-to-Detection Network that jointly optimizes visual restoration and salient object detection within a mutually beneficial framework. Unlike traditional methods that rely on disjointed pipelines or rigid physical priors, UniV2D introduces a semantic-driven learning paradigm: high-level saliency semantics actively guide the restoration process, while the restored visual cues reciprocally enhance saliency perception. Specifically, UniV2D features a hierarchical dual-branch architecture. It first employs a self-calibrated decoder to predict initial saliency masks alongside a mask-aware restoration module to reconstruct image content. Subsequently, a saliency-guided refinement module equipped with cross-level modulation is utilized to align structural fidelity with semantic consistency. Extensive experiments across multiple benchmarks demonstrate that UniV2D significantly outperforms state-of-the-art methods in both quantitative and qualitative evaluations, establishing a new standard for joint underwater perception.
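
The abstract states that restoration and saliency detection are optimized jointly but does not give the training objective. As a rough illustration only, one common way to couple two such tasks is a weighted sum of a saliency loss and a restoration loss. The sketch below assumes binary cross-entropy for the saliency mask, an L1 term for the restored image, and a balancing weight `lam`; the function names, the choice of losses, and the weight are all hypothetical and are not taken from the paper.

```python
import math


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(pred, target):
        # Clamp probabilities to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(pred)


def l1(restored, reference):
    """Mean absolute error between restored and reference pixel values."""
    return sum(abs(r - g) for r, g in zip(restored, reference)) / len(restored)


def joint_loss(sal_pred, sal_gt, restored, reference, lam=0.5):
    """Hypothetical joint objective: saliency loss plus weighted restoration loss.

    `lam` is an assumed balancing weight, not a value reported by the authors.
    """
    return bce(sal_pred, sal_gt) + lam * l1(restored, reference)


if __name__ == "__main__":
    # Flattened toy "images": two pixels each.
    loss = joint_loss(
        sal_pred=[0.9, 0.2],
        sal_gt=[1, 0],
        restored=[0.4, 0.6],
        reference=[0.5, 0.5],
    )
    print(f"joint loss: {loss:.4f}")
```

Because both terms are backpropagated through shared features in a setup like this, gradients from the saliency term can influence the restoration branch and vice versa, which is the kind of two-way coupling the abstract describes. How UniV2D actually balances or structures its losses is not specified in the text above.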

Problem

Research questions and friction points this paper is trying to address.

underwater salient object detection

visual degradation

semantic inconsistency

enhance-then-detect paradigm

Innovation

Methods, ideas, or system contributions that make the work stand out.

underwater salient object detection

joint optimization

semantic-driven restoration

unified vision-to-detection

cross-level modulation
