🤖 AI Summary
Image deraining is critical for autonomous driving and related vision tasks, yet existing single-scale methods struggle to recover fine-grained details while preserving global structural consistency. To address this, we propose PRISM, a progressive deraining framework with three stages: coarse extraction, frequency fusion, and fine-grained refinement. The first two stages use a Hybrid Attention UNet (HA-UNet), which combines channel attention with windowed spatial transformers for multi-scale feature aggregation. The fusion stage additionally uses Hybrid Domain Mamba (HDMamba), which jointly models spatial semantics and wavelet-domain characteristics to better capture rain patterns. The final stage uses an original-resolution subnetwork to recover fine-grained structures such as textures and edges. Experiments on multiple datasets show that PRISM achieves results competitive with recent deraining methods while preserving structural details and global context.
📝 Abstract
Image deraining is an essential vision technique that removes rain streaks and water droplets, enhancing clarity for critical vision tasks like autonomous driving. However, current single-scale models struggle with fine-grained recovery and global consistency. To address this challenge, we propose Progressive Rain removal with Integrated State-space Modeling (PRISM), a progressive three-stage framework: Coarse Extraction Network (CENet), Frequency Fusion Network (SFNet), and Refine Network (RNet). Specifically, CENet and SFNet utilize a novel Hybrid Attention UNet (HA-UNet) for multi-scale feature aggregation by combining channel attention with windowed spatial transformers. Moreover, we propose Hybrid Domain Mamba (HDMamba) for SFNet to jointly model spatial semantics and wavelet domain characteristics. Finally, RNet recovers the fine-grained structures via an original-resolution subnetwork. Our model learns high-frequency rain characteristics while preserving structural details and maintaining global context, leading to improved image quality. Our method achieves competitive results on multiple datasets against recent deraining methods.
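To illustrate why wavelet-domain features can help separate rain from image content, here is a minimal sketch of a single-level 2D Haar transform in plain Python. It is not the HDMamba module: the abstract does not say which wavelet PRISM uses, so the Haar basis, the function name `haar_dwt2`, and the toy 4x4 image are all assumptions made for illustration.

```python
def haar_dwt2(image):
    """Single-level 2D Haar transform of an even-sized grayscale image.

    `image` is a list of equal-length rows. Returns four half-resolution
    subbands: local average, column difference, row difference, diagonal.
    Illustrative only; the wavelet used by PRISM is not stated in the abstract.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")

    low, col_diff, row_diff, diag = [], [], [], []
    for r in range(0, rows, 2):
        low_row, col_row, row_row, diag_row = [], [], [], []
        for c in range(0, cols, 2):
            # Pixels of the current 2x2 block: top-left, top-right,
            # bottom-left, bottom-right.
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            low_row.append((a + b + d + e) / 2)   # smooth content
            col_row.append((a - b + d - e) / 2)   # change across columns
            row_row.append((a + b - d - e) / 2)   # change across rows
            diag_row.append((a - b - d + e) / 2)  # diagonal change
        low.append(low_row)
        col_diff.append(col_row)
        row_diff.append(row_row)
        diag.append(diag_row)
    return low, col_diff, row_diff, diag


# Toy input (hypothetical): flat background with a thin bright vertical
# streak in the second column, loosely mimicking a rain streak.
image = [
    [1.0, 3.0, 1.0, 1.0],
    [1.0, 3.0, 1.0, 1.0],
    [1.0, 3.0, 1.0, 1.0],
    [1.0, 3.0, 1.0, 1.0],
]
low, col_diff, row_diff, diag = haar_dwt2(image)
```

In this toy case the vertical streak produces non-zero values only in the column-difference subband, while the row-difference and diagonal subbands stay at zero. This kind of directional, high-frequency separation is the property a wavelet-domain branch can exploit alongside spatial features.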