AI Summary
Non-uniform haze in real-world scenarios causes severe image degradation, yet existing Transformer-based methods suffer from quadratic computational complexity, hindering real-time deployment. To address this, we propose a multi-state-aware linear-complexity dehazing network, introducing a novel paradigm that jointly models spatial, frequency-domain, and semantic information. Specifically, we extend the RWKV attention mechanism to the Fourier domain, design a deformable four-directional token shifting module for local adaptivity, incorporate a Fourier mixing block to capture long-range frequency-domain dependencies, and employ dynamic semantic kernels for cross-domain feature alignment. Our method achieves state-of-the-art performance across multiple benchmarks, reduces computational cost by 47–63%, and attains 52 FPS inference speed on 1080p images, marking the first end-to-end real-time dehazing framework that maintains high restoration fidelity.
Abstract
Image dehazing is crucial for reliable visual perception, yet it remains highly challenging under real-world non-uniform haze conditions. Although Transformer-based methods excel at capturing global context, their quadratic computational complexity hinders real-time deployment. To address this, we propose Fourier Receptance Weighted Key Value (Fourier-RWKV), a novel dehazing framework based on a Multi-State Perception paradigm. The model achieves comprehensive haze degradation modeling with linear complexity by synergistically integrating three distinct perceptual states: (1) Spatial-form Perception, realized through the Deformable Quad-directional Token Shift (DQ-Shift) operation, which dynamically adjusts receptive fields to accommodate local haze variations; (2) Frequency-domain Perception, implemented within the Fourier Mix block, which extends the core WKV attention mechanism of RWKV from the spatial domain to the Fourier domain, preserving the long-range dependencies essential for global haze estimation while mitigating spatial attenuation; (3) Semantic-relation Perception, facilitated by the Semantic Bridge Module (SBM), which utilizes Dynamic Semantic Kernel Fusion (DSK-Fusion) to precisely align encoder-decoder features and suppress artifacts. Extensive experiments on multiple benchmarks demonstrate that Fourier-RWKV delivers state-of-the-art performance across diverse haze scenarios while significantly reducing computational overhead, establishing a favorable trade-off between restoration quality and practical efficiency. Code is available at: https://github.com/Dilizlr/Fourier-RWKV.
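For intuition on the linear-complexity claim, the following is a minimal sketch of the causal, scalar WKV recurrence from the original RWKV formulation. It is not the Fourier-domain variant proposed in this work, and it omits the numerical stabilization used in practice. The function names and the parameters `w` (decay rate) and `u` (bonus for the current token) are illustrative assumptions. The sketch shows that carrying two running sums reproduces the all-pairs weighted average without re-summing over earlier tokens.

```python
import math


def wkv_quadratic(k, v, w, u):
    """Reference O(T^2) form: each output re-sums over all earlier tokens."""
    out = []
    for t in range(len(k)):
        # The current token is weighted by exp(u + k_t).
        num = math.exp(u + k[t]) * v[t]
        den = math.exp(u + k[t])
        for i in range(t):
            # Earlier tokens are weighted by exp(k_i), decayed with distance.
            weight = math.exp(-(t - 1 - i) * w + k[i])
            num += weight * v[i]
            den += weight
        out.append(num / den)
    return out


def wkv_linear(k, v, w, u):
    """Same outputs in O(T): carry a running numerator and denominator."""
    out = []
    a = 0.0  # running sum of decayed exp(k_i) * v_i over past tokens
    b = 0.0  # running sum of decayed exp(k_i) over past tokens
    decay = math.exp(-w)
    for kt, vt in zip(k, v):
        cur = math.exp(u + kt)
        out.append((a + cur * vt) / (b + cur))
        # Fold the current token into the state and decay older contributions.
        a = decay * a + math.exp(kt) * vt
        b = decay * b + math.exp(kt)
    return out


if __name__ == "__main__":
    keys = [0.1, -0.3, 0.5, 0.0]
    values = [1.0, 2.0, -1.0, 0.5]
    print(wkv_quadratic(keys, values, w=0.5, u=0.2))
    print(wkv_linear(keys, values, w=0.5, u=0.2))
```

Both functions return the same sequence; the second touches each token once, which is the property that lets RWKV-style models scale linearly with the number of tokens.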