🤖 AI Summary
Real-world underwater images suffer from multiple concurrent degradations—including color cast, haze, blur, and low contrast—yet existing methods typically model only one or two degradations, limiting generalization. To address this, we propose the first end-to-end, scene-agnostic underwater image restoration framework. Our approach features: (1) a novel Mamba-driven Mixture-of-Experts (MoE) module that explicitly decouples and jointly mitigates heterogeneous degradations; (2) a spatial-frequency dual-domain prior generator for task-adaptive prompt selection; and (3) integration of large-model pre-trained depth priors to enable region-wise depth-aware feature modulation. Extensive experiments demonstrate state-of-the-art performance across multiple benchmarks, with average PSNR and SSIM gains of 2.1 dB and 0.035, respectively. Qualitative results show more natural visual restoration, and the framework exhibits strong robustness to unseen degradation types.
📝 Abstract
Existing underwater image restoration (UIR) methods generally handle only color distortion, or jointly address color and haze, but they often overlook the more complex degradations that can occur in underwater scenes. To address this limitation, we propose a Universal Underwater Image Restoration method, termed UniUIR, which treats the complex scenario of real-world mixed underwater distortions in an all-in-one manner. To decouple degradation-specific issues and explore the inter-correlations among various degradations in the UIR task, we design the Mamba Mixture-of-Experts module. This module enables each expert to identify distinct types of degradation and collaboratively extract task-specific priors while maintaining global feature representation with linear complexity. Building upon this foundation, to enhance degradation representation and resolve the task conflicts that arise when handling multiple types of degradation, we introduce the spatial-frequency prior generator. This module extracts degradation priors in both the spatial and frequency domains, and adaptively selects the most appropriate task-specific prompts based on image content, thereby improving restoration accuracy. Finally, to more effectively address complex, region-dependent distortions in the UIR task, we incorporate depth information derived from a large-scale pre-trained depth prediction model, enabling the network to perceive and exploit depth variations across image regions when handling localized degradation. Extensive experiments demonstrate that UniUIR produces more visually appealing results in both qualitative and quantitative comparisons, and exhibits stronger generalization than state-of-the-art methods.
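The abstract's core mechanism is expert routing driven by degradation priors extracted in both the spatial and frequency domains. UniUIR's actual architecture (Mamba experts, learned prompt selection, depth modulation) is not reproduced here; the following is a minimal NumPy sketch of the general idea only: a toy spatial-frequency descriptor gates a mixture of simple restoration "experts". All function names, the toy experts, and the random gating weights are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def spatial_freq_prior(img):
    """Toy degradation descriptor combining spatial and frequency statistics.
    (Illustrative stand-in for the spatial-frequency prior generator.)"""
    gray = img.mean(axis=-1)
    spatial = gray.std()                       # low contrast -> small value
    spec = np.abs(np.fft.fftshift(np.fft.fft2(gray)))
    h, w = spec.shape
    cy, cx = h // 2, w // 2
    low = spec[cy - 2:cy + 3, cx - 2:cx + 3].sum()   # low-freq energy (haze/cast)
    high = spec.sum() - low                          # high-freq energy (detail)
    return np.array([spatial, low / (low + high + 1e-8)])

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_restore(img, experts, gate_w):
    """Mixture-of-Experts: gate weights computed from the prior blend experts."""
    prior = spatial_freq_prior(img)
    gate = softmax(gate_w @ prior)             # one weight per expert
    out = sum(g * f(img) for g, f in zip(gate, experts))
    return out, gate

# Toy experts: identity, contrast stretch, per-channel color rebalance.
experts = [
    lambda x: x,
    lambda x: (x - x.min()) / (x.max() - x.min() + 1e-8),
    lambda x: x / (x.mean(axis=(0, 1), keepdims=True) + 1e-8) * x.mean(),
]

rng = np.random.default_rng(0)
img = rng.random((32, 32, 3)) * 0.3 + 0.4      # low-contrast toy "underwater" image
gate_w = rng.standard_normal((3, 2))           # random stand-in for learned gating
restored, gate = moe_restore(img, experts, gate_w)
```

In the paper, the experts are Mamba-based and the gating is learned end-to-end; the sketch only shows why a prior that separates spatial contrast from low/high-frequency energy gives the router a handle on which degradation dominates.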