Unpaired Image Deraining Using Reward-Guided Self-Reinforcement Strategy

📅 2026-05-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Unsupervised image deraining remains challenging because ground-truth labels are absent and rain streak degradation is complex and diverse, which together impede model convergence. This work proposes RGSUD, which introduces a reward-guided self-reinforcement strategy into this task. Using an image quality assessment (IQA)-driven dynamic reward mechanism, RGSUD selects high-quality derained outputs produced during training, uses them to construct pseudo-paired data, and feeds them back into a self-reinforced training framework, improving the alignment between derained outputs and clean images. The method stabilizes optimization and reports state-of-the-art unsupervised deraining performance on paired synthetic, paired real, and unpaired real datasets, outperforming existing unsupervised methods in both visual quality and quantitative IQA metrics. The self-reinforcement strategy is also shown to adapt to other unsupervised deraining methods, and the framework generalizes across existing supervised deraining networks.

📝 Abstract

Unsupervised deraining has attracted attention for its ability to learn the real-world distribution of rain without paired supervision. However, the lack of strong constraints makes it difficult for the network to converge, especially given the complexity and diversity of rain degradation. Our key motivation is that high-quality deraining results occasionally emerge during training and can be leveraged to guide the optimization process. To overcome these challenges, we introduce RGSUD (Reward-Guided Self-Reinforcement Unsupervised Image Deraining), comprising two key stages: reward recycling and self-reinforcement (SR) training. In the former stage, we propose an Image Quality Assessment (IQA)-based dynamic reward recycling mechanism that selects the best derained outputs during training and continuously collects high-quality derained images as rewards. In the latter stage, we incorporate these rewards into the model's optimization process, constraining the optimization space and improving alignment between derained outputs and clean images. By leveraging an IQA-based self-reinforced loss and dynamically updated rewards, we enhance the quality of the synthesized pseudo-paired data and stabilize the optimization. Extensive experiments demonstrate that our method achieves state-of-the-art (SOTA) performance across multiple datasets, including paired synthetic, paired real, and unpaired real images, outperforming existing unsupervised deraining approaches in both subjective and objective IQA metrics. Additionally, we show that the self-reinforcement strategy is adaptable to other unsupervised deraining methods and that our deraining framework generalizes well across existing supervised deraining networks.
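The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the IQA model, the selection rule, or the loss form, so everything here is an assumption. `RewardPool`, `toy_iqa`, and `self_reinforced_l1` are hypothetical names, the keep-the-best-output-per-input rule is one plausible reading of "dynamic reward recycling", and the L1 distance stands in for whatever self-reinforced loss the paper actually uses. Images are plain lists of floats so that the example runs without any deep learning library.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Image = List[float]  # flattened pixel values in [0, 1]; stand-in for a real tensor


def toy_iqa(img: Image) -> float:
    """Hypothetical no-reference quality score (higher is better).

    Uses negative total variation purely as a placeholder; a real system
    would plug in a learned or hand-crafted IQA metric here.
    """
    return -sum(abs(img[i + 1] - img[i]) for i in range(len(img) - 1))


@dataclass
class RewardPool:
    """Stage 1 (assumed form): keep the best derained output seen per rainy input."""

    score_fn: Callable[[Image], float]
    best: Dict[str, Tuple[float, Image]] = field(default_factory=dict)

    def update(self, image_id: str, derained: Image) -> bool:
        """Store `derained` if its score beats the current reward for this input."""
        score = self.score_fn(derained)
        current = self.best.get(image_id)
        if current is None or score > current[0]:
            self.best[image_id] = (score, list(derained))
            return True
        return False

    def reward(self, image_id: str) -> Optional[Image]:
        """Return the stored reward image, or None if nothing has been collected."""
        entry = self.best.get(image_id)
        return None if entry is None else entry[1]


def self_reinforced_l1(derained: Image, reward: Image) -> float:
    """Stage 2 (assumed form): mean absolute error against the stored reward,
    which acts as a pseudo ground truth for the current output."""
    return sum(abs(a - b) for a, b in zip(derained, reward)) / len(derained)


# Usage: outputs from successive training iterations for the same rainy input.
pool = RewardPool(score_fn=toy_iqa)
pool.update("img_001", [0.0, 1.0, 0.0])   # first output becomes the reward
pool.update("img_001", [0.5, 0.5, 0.5])   # higher score, replaces the reward
pool.update("img_001", [0.0, 0.5, 0.0])   # lower score, ignored
loss = self_reinforced_l1([0.5, 0.0, 0.5], pool.reward("img_001"))
```

Because the pool only ever replaces an entry with a higher-scoring one, the pseudo ground truth for each input can improve over training but never regress, which is one way to read the abstract's claim that dynamically updated rewards stabilize optimization.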

Problem

Research questions and friction points this paper is trying to address.

unsupervised deraining

rain degradation

optimization convergence

unpaired image restoration

Innovation

Methods, ideas, or system contributions that make the work stand out.

unsupervised deraining

reward-guided self-reinforcement

image quality assessment (IQA)

pseudo-paired learning