🤖 AI Summary
Very-high-resolution (VHR, 0.5 m) SAR imagery suffers from poor interpretability due to severe speckle noise and complex scattering mechanisms; existing GAN-based SAR-to-optical translation methods face challenges including training instability, low generation fidelity, and reliance on low-resolution paired data.
Method: This work introduces, for the first time, a conditional Brownian Bridge Diffusion Model (BBDM) for VHR SAR-to-optical image translation. By formulating translation as a stochastic bridge pinned at both the SAR and optical endpoints, rather than a reverse diffusion from pure noise, BBDM provides a more robust forward and reverse process prior that improves structural consistency and texture realism. The model is trained end-to-end on the high-resolution MSAW dataset of paired SAR and optical imagery, entirely eschewing GAN architectures.
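To make the bridge prior concrete, below is a minimal NumPy sketch of the BBDM forward process from the original BBDM formulation, where x_t is drawn from a Gaussian whose mean interpolates between the optical image x0 and the SAR condition y (m_t = t/T) and whose variance δ_t = 2s(m_t − m_t²) vanishes at both endpoints. This is an illustrative assumption-level sketch, not the authors' training code; the function name and the scale parameter `s` are hypothetical.

```python
import numpy as np

def brownian_bridge_sample(x0, y, t, T, s=1.0, rng=None):
    """Sample x_t ~ q(x_t | x0, y) for a Brownian bridge diffusion.

    The bridge is pinned at x0 (t = 0, optical target) and y (t = T,
    SAR condition); the variance delta_t is zero at both endpoints,
    so the chain starts exactly at x0 and ends exactly at y.
    """
    rng = np.random.default_rng() if rng is None else rng
    m_t = t / T                            # linear schedule m_t = t/T
    delta_t = 2.0 * s * (m_t - m_t ** 2)   # bridge variance, 0 at t=0 and t=T
    eps = rng.standard_normal(x0.shape)    # standard Gaussian noise
    return (1.0 - m_t) * x0 + m_t * y + np.sqrt(delta_t) * eps
```

Because the terminal state is the conditioning image itself rather than pure noise, the reverse process only has to traverse the bridge between the two domains, which is the intuition behind the improved fidelity reported here.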
Contribution/Results: Extensive experiments demonstrate that our method consistently outperforms state-of-the-art conditional diffusion models and diverse GAN baselines across key perceptual quality metrics—including LPIPS, FID, and NIQE—establishing a novel paradigm for cross-modal translation of VHR remote sensing imagery.
📝 Abstract
Synthetic Aperture Radar (SAR) imaging provides the unique advantage of collecting data regardless of weather conditions and time of day. However, SAR images exhibit complex backscatter patterns and speckle noise, which require expertise to interpret. Research on translating SAR images into optical-like representations has been conducted to aid the interpretation of SAR data. Nevertheless, existing studies have predominantly relied on low-resolution satellite imagery datasets and on Generative Adversarial Networks (GANs), which are known for training instability and low fidelity. To overcome these limitations of low-resolution data usage and GAN-based approaches, this letter introduces a conditional image-to-image translation approach based on the Brownian Bridge Diffusion Model (BBDM). We conducted comprehensive experiments on the MSAW dataset, a collection of paired SAR and optical images at 0.5 m very-high resolution (VHR). The experimental results indicate that our method surpasses both conditional diffusion models (CDMs) and GAN-based models on diverse perceptual quality metrics.