🤖 AI Summary
Cloud removal in remote sensing images remains challenging: conventional diffusion models generate cloud-free images from random noise and ignore the information already present in the cloudy input, which leads to significant information loss. To address this, we propose EMRDM, a modular mean-reverting diffusion model designed for cloud removal that establishes a direct diffusion mapping from cloudy to cloud-free images, departing from the conventional “noise-to-image” paradigm. Methodologically, EMRDM reformulates the forward diffusion process and introduces a new ODE-based backward process; restructures the U-Net denoiser with a preconditioning technique; reorganizes the training process; provides both deterministic and stochastic samplers; and adds a denoising network that denoises sequential images jointly for multi-temporal cloud removal. Experiments on mono-temporal and multi-temporal remote sensing datasets show that EMRDM outperforms existing methods. The source code is publicly available.
📝 Abstract
Cloud removal (CR) remains a challenging task in remote sensing image processing. Although diffusion models (DMs) exhibit strong generative capabilities, applying them directly to CR is suboptimal, as they generate cloudless images from random noise and ignore the inherent information in cloudy inputs. To overcome this drawback, we develop a new CR model, EMRDM, based on mean-reverting diffusion models (MRDMs), to establish a direct diffusion process between cloudy and cloudless images. Compared to current MRDMs, EMRDM offers a modular framework with updatable modules and an elucidated design space, based on a reformulated forward process and a new ordinary differential equation (ODE)-based backward process. Leveraging our framework, we redesign key MRDM modules to boost CR performance, including restructuring the denoiser via a preconditioning technique, reorganizing the training process, and improving the sampling process by introducing deterministic and stochastic samplers. To achieve multi-temporal CR, we further develop a denoising network that denoises sequential images simultaneously. Experiments on mono-temporal and multi-temporal datasets demonstrate the superior performance of EMRDM. Our code is available at https://github.com/Ly403/EMRDM.
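
To make the idea of a direct diffusion process between cloudy and cloudless images more concrete, the following is a minimal sketch of a generic mean-reverting forward process. It is not EMRDM's reformulated forward process, which the abstract does not specify. It assumes a textbook Ornstein-Uhlenbeck process with constant rate `theta` and noise scale `sigma`, and the function name `mean_reverting_sample` and the toy pixel values are hypothetical.

```python
import math
import random


def mean_reverting_sample(clean, cloudy, t, theta=1.0, sigma=0.1, rng=None):
    """Sample x_t from an Ornstein-Uhlenbeck process dx = theta*(mu - x)dt + sigma*dW.

    Assumption for illustration only: x_0 is the cloud-free image (`clean`)
    and the mean mu is the cloudy image (`cloudy`), both given as flat lists
    of pixel values. This is a generic process, not the one defined in EMRDM.
    """
    if rng is None:
        rng = random.Random()
    decay = math.exp(-theta * t)  # how much of x_0 survives at time t
    # Closed-form standard deviation of the OU marginal at time t.
    std = math.sqrt(sigma ** 2 / (2.0 * theta) * (1.0 - decay ** 2))
    return [
        mu + (x0 - mu) * decay + std * rng.gauss(0.0, 1.0)
        for x0, mu in zip(clean, cloudy)
    ]


# Toy three-pixel "images" (hypothetical values).
clean = [0.2, 0.4, 0.6]
cloudy = [0.9, 0.8, 0.95]

# At t = 0 the sample coincides with the cloud-free image.
x_start = mean_reverting_sample(clean, cloudy, t=0.0, rng=random.Random(0))

# For large t (noise switched off here) the sample reverts to the cloudy image.
x_end = mean_reverting_sample(clean, cloudy, t=20.0, sigma=0.0)
```

In this sketch the process starts at the cloud-free image and drifts toward the cloudy one, so a learned backward process would begin from a noisy cloudy image rather than from pure noise, which is the contrast with the "noise-to-image" setting that the abstract draws.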