🤖 AI Summary
This work addresses residual artifacts and insufficient detail recovery in image shadow removal by proposing a three-stage progressive deshadowing method. The approach models the task as iterative refinement, in which each later stage corrects the artifacts left by the previous prediction. It integrates RGB appearance cues with frozen DINOv2 semantic features and geometric information from monocular depth and surface normals, and reuses these multi-modal features across all stages. Built on the OmniSR architecture, the method introduces a contraction-constrained loss that encourages non-increasing reconstruction error across the cascade, and it uses staged training with cosine-annealed checkpoint ensembling. The final ensemble ranked first on the NTIRE 2026 WSRD+ hidden test set with PSNR 26.680, SSIM 0.8740, LPIPS 0.0578, and FID 26.135, and the model is further validated on ISTD+ and UAV-SC+.
📝 Abstract
We present a three-stage progressive shadow-removal pipeline for the CVPR 2026 NTIRE WSRD+ challenge. Built on OmniSR, our method treats deshadowing as iterative direct refinement, where later stages correct residual artefacts left by earlier predictions. The model combines RGB appearance with frozen DINOv2 semantic guidance and geometric cues from monocular depth and surface normals, reused across all stages. To stabilise multi-stage optimisation, we introduce a contraction-constrained objective that encourages non-increasing reconstruction error across the cascade. A staged training pipeline transfers from earlier WSRD pretraining to WSRD+ supervision and final WSRD+ 2026 adaptation with cosine-annealed checkpoint ensembling. On the official WSRD+ 2026 hidden test set, our final ensemble achieved 26.680 PSNR, 0.8740 SSIM, 0.0578 LPIPS, and 26.135 FID, ranked first overall, and won the NTIRE 2026 Image Shadow Removal Challenge. The strong performance of the proposed model is further validated on the ISTD+ and UAV-SC+ datasets.
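The abstract does not give the exact form of the contraction-constrained objective. The sketch below shows one plausible reading, assuming a hinge penalty on any stage whose reconstruction error exceeds that of the stage before it. The function names (`mse`, `contraction_penalty`, `cascade_loss`), the `margin` and `weight` parameters, and the use of mean squared error are all illustrative assumptions, not the authors' implementation.

```python
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two flat pixel sequences (assumed error measure)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def contraction_penalty(stage_errors: Sequence[float], margin: float = 0.0) -> float:
    """Hinge penalty: positive only where a stage's error exceeds the previous stage's.

    `margin` is a hypothetical knob; margin > 0 would demand a strict decrease.
    """
    return sum(
        max(0.0, cur - prev + margin)
        for prev, cur in zip(stage_errors, stage_errors[1:])
    )


def cascade_loss(
    stage_preds: Sequence[Sequence[float]],
    target: Sequence[float],
    weight: float = 1.0,
) -> float:
    """Sum of per-stage reconstruction errors plus the weighted contraction penalty."""
    errors = [mse(pred, target) for pred in stage_preds]
    return sum(errors) + weight * contraction_penalty(errors)


# Toy three-stage cascade on a two-pixel "image": each stage moves closer to the target.
target = [1.0, 1.0]
improving = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
regressing = [[0.5, 0.5], [0.0, 0.0]]  # second stage is worse than the first

print(cascade_loss(improving, target))
print(cascade_loss(regressing, target))
```

When every stage improves on the previous one, the penalty term is zero and the loss reduces to the sum of per-stage errors. A stage that degrades the previous prediction is charged the amount of the increase, which is one way to encourage (but not guarantee) non-increasing error across the cascade.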