Semi-Supervised State-Space Model with Dynamic Stacking Filter for Real-World Video Deraining

📅 2025-05-22
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing video deraining methods rely heavily on paired synthetic data, resulting in poor generalization to real-world rainy scenes. To address this, we propose a dual-branch spatiotemporal state-space model that jointly performs spatial feature extraction and inter-frame temporal modeling, augmented by dynamic stacked filters for pixel-wise adaptive feature optimization. We further introduce a semi-supervised median-stacking loss and a sparsity-prior-driven pseudo-label generation strategy. Moreover, we construct RainTrack, the first real-world rainy-video benchmark explicitly designed for object detection and tracking. Our method eliminates dependence on synthetic training data and achieves state-of-the-art performance (superior PSNR/SSIM) on both multi-source synthetic and real-world videos, with efficient inference. Crucially, it significantly enhances the robustness of downstream detection and tracking tasks under rainy conditions.

📝 Abstract
Significant progress has been made in video restoration under rainy conditions over the past decade, largely propelled by advancements in deep learning. Nevertheless, existing methods that depend on paired data struggle to generalize effectively to real-world scenarios, primarily due to the disparity between synthetic and authentic rain effects. To address these limitations, we propose a dual-branch spatio-temporal state-space model to enhance rain streak removal in video sequences. Specifically, we design spatial and temporal state-space model layers to extract spatial features and incorporate temporal dependencies across frames, respectively. To improve multi-frame feature fusion, we derive a dynamic stacking filter, which adaptively approximates statistical filters for superior pixel-wise feature refinement. Moreover, we develop a median stacking loss to enable semi-supervised learning by generating pseudo-clean patches based on the sparsity prior of rain. To further explore the capacity of deraining models in supporting other vision-based tasks in rainy environments, we introduce a novel real-world benchmark focused on object detection and tracking in rainy conditions. Our method is extensively evaluated across multiple benchmarks containing numerous synthetic and real-world rainy videos, consistently demonstrating its superiority in quantitative metrics, visual quality, efficiency, and its utility for downstream tasks.
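The semi-supervised component rests on the sparsity prior of rain: streaks are sparse and move between frames, so at most pixels the temporal median over a short window of aligned frames approximates the clean background. The sketch below illustrates that pseudo-label idea and a simple L1 loss against it; the function names and the exact loss form are illustrative assumptions, not the paper's implementation (which operates on aligned patches inside the training pipeline).

```python
import numpy as np

def median_stack_pseudo_clean(frames):
    """Pseudo-clean target via temporal median stacking.

    frames: (T, H, W) array of T temporally aligned frames. Because rain
    occupies any given pixel in only a minority of frames (sparsity
    prior), the per-pixel temporal median recovers the background.
    """
    return np.median(frames, axis=0)

def median_stacking_loss(pred, frames):
    """Hypothetical semi-supervised loss: mean L1 distance between the
    network output and the median-stacked pseudo-clean target."""
    target = median_stack_pseudo_clean(frames)
    return float(np.mean(np.abs(pred - target)))

# Toy example: constant 0.5 background, one bright streak row per frame.
frames = np.full((5, 4, 4), 0.5)
for t in range(5):
    frames[t, t % 4, :] = 1.0  # rain hits each row in at most 2 of 5 frames
pseudo_clean = median_stack_pseudo_clean(frames)  # recovers 0.5 everywhere
```

Because each pixel is rainy in fewer than half of the frames, the median is unaffected by the streaks, which is exactly why a median-based pseudo-label needs no paired ground truth.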
Problem

Research questions and friction points this paper is trying to address.

Enhancing rain streak removal in real-world videos
Addressing synthetic-authentic rain disparity in video restoration
Improving object detection and tracking in rainy conditions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-branch spatio-temporal state-space model
Dynamic stacking filter for feature refinement
Semi-supervised learning with median stacking loss
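The dynamic stacking filter is described as adaptively approximating statistical filters for pixel-wise feature refinement. One generic way to realize that idea, shown below purely as a sketch, is a per-pixel softmax-weighted combination of the temporally stacked features: uniform weights recover the mean, while peaked weights select individual frames, so the filter can interpolate between order statistics. The parameterization via `logits` (which would come from a small predictor network) is an assumption, not the paper's design.

```python
import numpy as np

def dynamic_stacking_filter(stack, logits):
    """Pixel-wise adaptive fusion of a temporal feature stack.

    stack:  (T, H, W) features stacked over T frames.
    logits: (T, H, W) unnormalized per-pixel weights (assumed to be
            predicted by a learned module in a real model).
    """
    # Numerically stable softmax over the temporal axis.
    w = np.exp(logits - logits.max(axis=0, keepdims=True))
    w /= w.sum(axis=0, keepdims=True)
    # Weighted sum over frames gives one refined (H, W) feature map.
    return (w * stack).sum(axis=0)
```

With all-zero logits this reduces to the temporal mean; with logits strongly peaked at one frame it reduces to selecting that frame, which is the sense in which a learned weighting can approximate different statistical filters per pixel.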
Shangquan Sun
University of Chinese Academy of Sciences
Computer Vision · Machine Learning

Wenqi Ren
School of Cyber Science and Technology, Shenzhen Campus of Sun Yat-sen University; MoE Key Laboratory of Information Technology; Guangdong Provincial Key Laboratory of Information Security Technology

Juxiang Zhou
Key Laboratory of Educational Information for Nationalities, Yunnan Normal University

Shu Wang
School of Mechanical Engineering and Automation, Fuzhou University

Jianhou Gan
Key Laboratory of Educational Information for Nationalities, Yunnan Normal University

Xiaochun Cao
Sun Yat-sen University
Computer Vision · Artificial Intelligence · Multimedia · Machine Learning