🤖 AI Summary
Image dehazing faces three key challenges: haze density varies significantly across a scene, existing methods generalize poorly, and accuracy trades off against efficiency. To address these, we propose a plug-and-play RGB-D fusion module. We show, for the first time, that depth features extracted from large-scale pre-trained depth estimation models remain strongly consistent across multiple haze levels. Leveraging this stability, we design a lightweight, architecture-agnostic feature fusion mechanism. The module integrates into diverse mainstream dehazing networks without increasing inference overhead, improving robustness and cross-scenario generalization. Extensive experiments on standard benchmarks, including SOTS and RESIDE, demonstrate substantial improvements in PSNR and SSIM while maintaining real-time inference efficiency. The approach thus narrows the gap between high-fidelity restoration and practical deployment requirements.
📝 Abstract
Image dehazing remains a challenging problem due to the spatially varying nature of haze in real-world scenes. While existing methods have demonstrated the promise of large-scale pretrained models for image dehazing, their architecture-specific designs hinder adaptability across diverse scenarios with different accuracy and efficiency requirements. In this work, we systematically investigate how well pretrained depth representations, learned from millions of diverse images, generalize to image dehazing. Our empirical analysis reveals that the learned deep depth features remain remarkably consistent across varying haze levels. Building on this insight, we propose a plug-and-play RGB-D fusion module that integrates seamlessly with diverse dehazing architectures. Extensive experiments across multiple benchmarks validate both the effectiveness and the broad applicability of our approach.
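The abstract does not specify how the fusion module is built, so the following is only a minimal sketch of one way a plug-and-play RGB-D fusion step could look, not the authors' design. It assumes a gated residual formulation in which projected depth features are added to the RGB features of a host network. The class name `DepthFusion`, the zero-initialised projection, and the sigmoid gate are all hypothetical choices made for illustration, and NumPy stands in for a deep learning framework.

```python
import numpy as np


class DepthFusion:
    """Hypothetical gated residual fusion of depth features into RGB features.

    Assumption (not from the paper): out = rgb + gate * proj(depth), where
    proj is a 1x1 projection and gate is a per-pixel sigmoid mask.
    """

    def __init__(self, rgb_channels, depth_channels, seed=0):
        rng = np.random.default_rng(seed)
        # 1x1 projection from depth channels to RGB channels.
        # Zero initialisation makes the module an identity map at the start,
        # so inserting it does not disturb a pretrained host network.
        self.proj = np.zeros((rgb_channels, depth_channels))
        # 1x1 gate computed from the concatenated RGB and depth features.
        self.gate = rng.normal(
            0.0, 0.02, size=(rgb_channels, rgb_channels + depth_channels)
        )

    def __call__(self, rgb_feat, depth_feat):
        # rgb_feat: (C_rgb, H, W); depth_feat: (C_depth, H, W)
        if rgb_feat.shape[1:] != depth_feat.shape[1:]:
            raise ValueError("RGB and depth features must share spatial size")
        stacked = np.concatenate([rgb_feat, depth_feat], axis=0)
        # Per-pixel gate in (0, 1) deciding how much depth signal to admit.
        logits = np.einsum("oc,chw->ohw", self.gate, stacked)
        gate = 1.0 / (1.0 + np.exp(-logits))
        # Project depth features into the RGB feature space.
        projected = np.einsum("oc,chw->ohw", self.proj, depth_feat)
        # The residual form keeps the output shape equal to the RGB input shape.
        return rgb_feat + gate * projected


# Usage with random stand-ins for backbone and depth-encoder features.
rng = np.random.default_rng(1)
rgb = rng.normal(size=(8, 4, 4))
depth = rng.normal(size=(3, 4, 4))
fusion = DepthFusion(rgb_channels=8, depth_channels=3)
fused = fusion(rgb, depth)
```

Because the output has the same shape as the RGB input, a block of this kind could in principle be placed between existing stages of a dehazing network without changing the surrounding layers, which is the property the term "plug-and-play" suggests. In a real system the projection and gate would be trained, and the depth features would come from a frozen pretrained depth estimator.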