🤖 AI Summary
To address the high computational complexity and the lack of physics-guided priors in vision-transformer-based image dehazing, this paper proposes a prior-driven lightweight dehazing network. The method is the first to jointly integrate three physical priors, the bright channel prior, the dark channel prior, and histogram equalization, and designs a Prior Aggregation Module with gated attention alongside a collaborative spatial-channel feature harmonization mechanism. Embedded within a U-Net architecture, these modules enable context-adaptive feature selection and low-frequency suppression along both dimensions. Evaluated on standard benchmarks, including the SOTS test sets of RESIDE, the approach achieves state-of-the-art quantitative performance while improving inference speed by 2.3× and reducing model parameters by 38%. It markedly enhances texture recovery and edge sharpness, striking an effective balance between efficiency and visual quality.
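For reference, the bright and dark channel priors mentioned above are classical per-pixel statistics (the dark channel following He et al.'s well-known formulation): a local minimum (or maximum) over a patch of the per-pixel minimum (or maximum) across color channels. Below is a minimal NumPy sketch; the 15-pixel patch size is a conventional default, not a detail taken from this paper:

```python
import numpy as np

def _local_extremum(x, patch, op):
    """Sliding-window min/max over a 2-D array with edge padding."""
    r = patch // 2
    p = np.pad(x, r, mode="edge")
    h, w = x.shape
    out = p[0:h, 0:w].copy()
    for dy in range(patch):
        for dx in range(patch):
            out = op(out, p[dy:dy + h, dx:dx + w])
    return out

def dark_channel(img, patch=15):
    """Dark Channel Prior: local minimum of the per-pixel minimum over RGB.

    img: H x W x 3 float array in [0, 1]. Haze-free outdoor images tend to
    have dark channels near zero; haze raises them."""
    return _local_extremum(img.min(axis=2), patch, np.minimum)

def bright_channel(img, patch=15):
    """Bright Channel Prior: local maximum of the per-pixel maximum over RGB."""
    return _local_extremum(img.max(axis=2), patch, np.maximum)
```

Networks like the one summarized here typically feed such prior maps as extra guidance inputs alongside the hazy image.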
📝 Abstract
Image dehazing is a crucial task that involves enhancing degraded images to recover their sharpness and textures. While vision Transformers have exhibited impressive results in diverse dehazing tasks, their quadratic complexity and lack of dehazing priors pose significant drawbacks for real-world applications. In this paper, guided by three priors, the Bright Channel Prior (BCP), the Dark Channel Prior (DCP), and Histogram Equalization (HE), we propose a Prior-guided Hierarchical Harmonization Network (PGH$^2$Net) for image dehazing. PGH$^2$Net is built upon a UNet-like architecture with an efficient encoder and decoder, consisting of two module types: (1) a prior aggregation module that injects BCP/DCP information and selects diverse contexts with gating attention, and (2) feature harmonization modules that subtract low-frequency components along both the spatial and channel dimensions and learn more informative feature distributions to equalize the feature maps.
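The low-frequency subtraction described in the abstract can be pictured as a simple high-pass operation in each dimension. The following is an illustrative sketch under our own assumptions, not the paper's actual module: spatially, subtract an average-pooled and nearest-neighbor-upsampled copy of each channel; channel-wise, subtract the per-pixel mean across channels.

```python
import numpy as np

def spatial_highpass(feat, k=4):
    """Suppress spatial low frequencies: subtract an average-pooled,
    nearest-neighbor-upsampled copy of each channel.

    feat: C x H x W array with H and W divisible by the pooling factor k."""
    c, h, w = feat.shape
    pooled = feat.reshape(c, h // k, k, w // k, k).mean(axis=(2, 4))
    low = pooled.repeat(k, axis=1).repeat(k, axis=2)  # upsample back to H x W
    return feat - low

def channel_highpass(feat):
    """Suppress the channel-wise DC component: subtract the per-pixel
    mean taken across all channels."""
    return feat - feat.mean(axis=0, keepdims=True)
```

Since the hazy veil is largely a smooth, low-frequency component, removing it in this way tends to emphasize the edges and textures that dehazing aims to recover.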