Fourier-RWKV: A Multi-State Perception Network for Efficient Image Dehazing

📅 2025-12-08
🤖 AI Summary
Non-uniform haze in real-world scenarios causes severe image degradation, yet existing Transformer-based methods suffer from quadratic computational complexity, hindering real-time deployment. To address this, we propose a multi-state-aware linear-complexity dehazing network, introducing a novel paradigm that jointly models spatial, frequency-domain, and semantic information. Specifically, we extend the RWKV attention mechanism to the Fourier domain, design a deformable four-directional token shifting module for local adaptivity, incorporate a Fourier mixing block to capture long-range frequency-domain dependencies, and employ dynamic semantic kernels for cross-domain feature alignment. Our method achieves state-of-the-art performance across multiple benchmarks, reduces computational cost by 47–63%, and attains 52 FPS inference speed on 1080p images, making it the first end-to-end real-time dehazing framework that maintains high restoration fidelity.

📝 Abstract
Image dehazing is crucial for reliable visual perception, yet it remains highly challenging under real-world non-uniform haze conditions. Although Transformer-based methods excel at capturing global context, their quadratic computational complexity hinders real-time deployment. To address this, we propose Fourier Receptance Weighted Key Value (Fourier-RWKV), a novel dehazing framework based on a Multi-State Perception paradigm. The model achieves comprehensive haze degradation modeling with linear complexity by synergistically integrating three distinct perceptual states: (1) Spatial-form Perception, realized through the Deformable Quad-directional Token Shift (DQ-Shift) operation, which dynamically adjusts receptive fields to accommodate local haze variations; (2) Frequency-domain Perception, implemented within the Fourier Mix block, which extends the core WKV attention mechanism of RWKV from the spatial domain to the Fourier domain, preserving the long-range dependencies essential for global haze estimation while mitigating spatial attenuation; (3) Semantic-relation Perception, facilitated by the Semantic Bridge Module (SBM), which utilizes Dynamic Semantic Kernel Fusion (DSK-Fusion) to precisely align encoder-decoder features and suppress artifacts. Extensive experiments on multiple benchmarks demonstrate that Fourier-RWKV delivers state-of-the-art performance across diverse haze scenarios while significantly reducing computational overhead, establishing a favorable trade-off between restoration quality and practical efficiency. Code is available at: https://github.com/Dilizlr/Fourier-RWKV.
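The spatial-form state described in the abstract builds on token shifting. As a rough illustration only, the following NumPy sketch shows a fixed (non-deformable) four-directional shift on a (C, H, W) feature map: each quarter of the channels is displaced by a small step in a different direction, so every position mixes in information from its four neighbours. The function name `quad_shift`, the zero-filled borders, and the equal four-way channel split are assumptions made for this sketch; the paper's DQ-Shift additionally adjusts the shift dynamically, which is not modelled here.

```python
import numpy as np


def quad_shift(x: np.ndarray, step: int = 1) -> np.ndarray:
    """Fixed four-directional token shift on a (C, H, W) feature map.

    Hypothetical helper for illustration; not the authors' DQ-Shift.
    Vacated border positions are zero-filled. Channels left over when
    C is not divisible by 4 are copied through unshifted.
    """
    c, h, w = x.shape
    if not 0 <= step < min(h, w):
        raise ValueError("step must satisfy 0 <= step < min(H, W)")

    g = c // 4  # channels per direction group
    out = np.zeros_like(x)

    # Group 0: content moves right, so each pixel sees its left neighbour.
    out[0 * g:1 * g, :, step:] = x[0 * g:1 * g, :, :w - step]
    # Group 1: content moves left, so each pixel sees its right neighbour.
    out[1 * g:2 * g, :, :w - step] = x[1 * g:2 * g, :, step:]
    # Group 2: content moves down, so each pixel sees its upper neighbour.
    out[2 * g:3 * g, step:, :] = x[2 * g:3 * g, :h - step, :]
    # Group 3: content moves up, so each pixel sees its lower neighbour.
    out[3 * g:4 * g, :h - step, :] = x[3 * g:4 * g, step:, :]
    # Any remaining channels pass through unchanged.
    out[4 * g:] = x[4 * g:]
    return out
```

The operation is a handful of slice copies, so its cost grows linearly with the number of pixels, which is why shift-style mixing is attractive when quadratic attention is too expensive.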
Problem

Research questions and friction points this paper is trying to address.

Efficiently removes non-uniform haze from images
Reduces computational complexity for real-time deployment
Integrates spatial, frequency, and semantic perception for restoration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Linear complexity via Multi-State Perception paradigm
Frequency-domain attention extends RWKV to Fourier domain
Dynamic Semantic Kernel Fusion aligns encoder-decoder features
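To make the idea of frequency-domain mixing concrete, here is a minimal NumPy sketch of a generic FFT-based filter. It is not the paper's Fourier-domain WKV mechanism. Multiplying the 2D spectrum by per-frequency weights lets every output pixel depend on every input pixel in a single step, which is the property that makes the Fourier domain useful for global haze estimation. `fourier_mix` and its `weight` argument are hypothetical names, and the FFT in this sketch costs O(N log N) in the number of pixels N rather than strictly linear time.

```python
import numpy as np


def fourier_mix(x: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """Global mixing of a real (C, H, W) feature map in the Fourier domain.

    Hypothetical helper for illustration; not the authors' Fourier Mix block.
    `weight` has shape (C, H, W // 2 + 1) and scales each frequency bin.
    """
    h, w = x.shape[-2:]
    # Real-input 2D FFT over the spatial axes: (C, H, W) -> (C, H, W // 2 + 1).
    spec = np.fft.rfft2(x, axes=(-2, -1))
    # Pointwise reweighting of frequency bins acts as a global spatial filter.
    spec = spec * weight
    # Back to the spatial domain at the original resolution.
    return np.fft.irfft2(spec, s=(h, w), axes=(-2, -1))
```

With all-ones weights the function returns its input unchanged. Keeping only the zero-frequency bin replaces each channel with its spatial mean, a simple case of one weight affecting every pixel at once.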
Lirong Zheng
Institute of Intelligent Information Processing, Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University, Shenzhen, China
Yanshan Li
Institute of Intelligent Information Processing, Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University, Shenzhen, China
Rui Yu
Institute of Intelligent Information Processing, Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University, Shenzhen, China
Kaihao Zhang
Australian National University
Deep learning, Computer vision