🤖 AI Summary
This work addresses the limitations of existing single-image dehazing methods, which rely predominantly on RGB inputs alone and struggle to capture the intrinsic relationship between haze distribution and scene depth, leading to poor robustness in complex real-world scenarios. To overcome this, we propose UDPNet, a novel framework that, for the first time, integrates lightweight depth priors from the large-scale pretrained model DepthAnything V2 into the dehazing pipeline. Our architecture introduces depth-guided channel attention, hierarchical multi-scale depth fusion, and a dual sliding-window multi-head cross-attention mechanism to improve adaptability to varying haze densities, illumination changes, and domain shifts. Extensive experiments demonstrate significant performance gains, with PSNR improvements of 0.85 dB, 1.19 dB, and 1.79 dB on the SOTS-indoor, Haze4K, and NHR benchmarks, respectively, establishing a new state of the art for depth-aware image dehazing.
📝 Abstract
Image dehazing has witnessed significant advances with the development of deep learning models. However, most existing methods focus solely on single-modal RGB features, neglecting the inherent correlation between scene depth and haze distribution. Even methods that jointly optimize depth estimation and image dehazing often suffer from suboptimal performance due to inadequate utilization of accurate depth information. In this paper, we present UDPNet, a general framework that leverages depth priors from the large-scale pretrained depth estimation model DepthAnything V2 to boost existing image dehazing models. Specifically, our architecture comprises two key components: a Depth-Guided Attention Module (DGAM), which adaptively modulates features via lightweight depth-guided channel attention, and a Depth Prior Fusion Module (DPFM), which hierarchically fuses multi-scale depth-map features via a dual sliding-window multi-head cross-attention mechanism. These modules ensure both computational efficiency and effective integration of depth priors. Moreover, the depth priors enable the network to adapt dynamically to varying haze densities, illumination conditions, and domain gaps between synthetic and real-world data. Extensive experiments demonstrate the effectiveness of UDPNet, which outperforms state-of-the-art methods on popular dehazing datasets, with PSNR improvements of 0.85 dB on SOTS-indoor, 1.19 dB on Haze4K, and 1.79 dB on NHR. Our solution establishes a new benchmark for depth-aware dehazing across diverse scenarios. Pretrained models and code are released at https://github.com/Harbinzzy/UDPNet.
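To make the idea of depth-guided channel attention concrete, the following is a minimal NumPy sketch of how a depth prior could gate feature channels. This is an illustration only, not the paper's actual DGAM: the weight shapes (`w1`, `w2`), the scalar depth pooling, and the two-layer MLP are hypothetical choices, since the abstract does not specify the module's internal layout.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def depth_guided_channel_attention(feats, depth, w1, w2):
    """Hypothetical sketch of depth-guided channel attention.

    feats: (C, H, W) image features; depth: (H, W) depth prior.
    w1, w2 are illustrative MLP weights (not from the paper).
    """
    d = np.array([depth.mean()])            # pool the depth map to a scalar statistic
    hidden = np.maximum(w1 @ d, 0.0)        # (C_mid,), ReLU
    gates = sigmoid(w2 @ hidden)            # (C,), per-channel gates in (0, 1)
    return feats * gates[:, None, None]     # modulate each feature channel

rng = np.random.default_rng(0)
C, H, W = 8, 16, 16
feats = rng.standard_normal((C, H, W))
depth = rng.uniform(size=(H, W))
w1 = rng.standard_normal((4, 1))            # 1 pooled depth value -> 4 hidden units
w2 = rng.standard_normal((C, 4))            # 4 hidden units -> C channel gates
out = depth_guided_channel_attention(feats, depth, w1, w2)
```

Because the gates lie in (0, 1), the module can only attenuate channels here; the actual DGAM presumably learns richer, lightweight modulation conditioned on the DepthAnything V2 prior.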