D2-CDIG: Controlled Diffusion Remote Sensing Image Generation with Dual Priors of DEM and Cloud-Fog

📅 2026-05-13

🤖 AI Summary
Existing controllable remote sensing image generation methods rely on segmentation or edge priors, which struggle to realistically model complex terrain and atmospheric phenomena, often yielding results lacking fine details and photorealism. To address this limitation, this work proposes the D2-CDIG framework, which introduces digital elevation models (DEMs) and cloud/fog information as dual priors into a diffusion model for the first time. The framework employs a dual-branch architecture to decouple surface and atmospheric generation pathways and incorporates a hierarchical control signal injection mechanism along with an adjustable cloud/fog slider, enabling precise manipulation of terrain structure and cloud distribution. This approach substantially enhances the realism, textural richness, and controllability of generated images, offering high-quality synthetic data to support large-scale remote sensing foundation models and downstream applications.
📝 Abstract
Remote sensing image generation provides a reliable data foundation for large remote sensing models and downstream tasks. However, existing controllable remote sensing image generation methods typically rely on traditional techniques such as segmentation and edge detection, which do not fully leverage terrain or atmospheric conditions. As a result, the generated images often lack accuracy and naturalness when dealing with complex terrains and atmospheric phenomena. In this paper, we propose a novel remote sensing image generation framework, D2-CDIG, which integrates diffusion models with a dual-prior control mechanism. By incorporating both Digital Elevation Model (DEM) and cloud-fog information as dual prior knowledge, D2-CDIG precisely controls ground features and atmospheric phenomena within the generated images. Specifically, D2-CDIG decouples the terrain and atmospheric generation processes through independent control of ground and atmospheric branches. Additionally, a refined cloud-fog slider is introduced to flexibly adjust cloud thickness and distribution. During training, ground and atmospheric control signals are injected in layers to ensure a seamless transition within the images. Compared to traditional methods based on segmentation or edge detection, D2-CDIG shows significant improvements in image quality, detail richness, and realism. D2-CDIG offers a flexible and precise solution for remote sensing image generation, providing high-quality data for training large remote sensing models and downstream tasks.
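The abstract does not specify how the two control branches are combined, so the following is only an illustrative sketch of one plausible reading: each branch contributes a per-layer residual to the denoiser's features, and the cloud-fog slider scales the atmospheric residual. The function name `inject_dual_controls`, the additive form, and the `[0, 1]` slider range are all assumptions made for illustration, not details taken from the paper. Feature maps are represented as flat lists of floats to keep the example dependency-free.

```python
def inject_dual_controls(features, ground_residuals, cloud_residuals, cloud_slider=0.5):
    """Add ground and cloud-fog residuals to backbone features, layer by layer.

    Hypothetical sketch: the additive rule and the slider range are assumptions,
    not the D2-CDIG implementation.

    features, ground_residuals, cloud_residuals: lists (one entry per layer)
        of equally sized lists of floats.
    cloud_slider: scalar in [0, 1] that scales only the atmospheric branch.
    """
    if not 0.0 <= cloud_slider <= 1.0:
        raise ValueError("cloud_slider must lie in [0, 1]")
    if not (len(features) == len(ground_residuals) == len(cloud_residuals)):
        raise ValueError("all inputs must have the same number of layers")

    fused = []
    for h, g, c in zip(features, ground_residuals, cloud_residuals):
        # The ground (DEM) residual is always applied at full strength;
        # the cloud-fog residual is scaled by the slider.
        fused.append([hi + gi + cloud_slider * ci for hi, gi, ci in zip(h, g, c)])
    return fused


# Toy two-layer example with hand-picked values.
features = [[1.0, 2.0], [4.0]]
ground = [[0.5, 0.5], [1.0]]
cloud = [[2.0, 4.0], [8.0]]

clear_sky = inject_dual_controls(features, ground, cloud, cloud_slider=0.0)
half_cloud = inject_dual_controls(features, ground, cloud, cloud_slider=0.5)
```

With the slider at 0 only the terrain branch influences the features, which mirrors the decoupling of ground and atmospheric pathways described above; raising the slider blends in progressively more of the atmospheric signal.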
Problem

Research questions and friction points this paper is trying to address.

remote sensing image generation
controllable generation
terrain modeling
atmospheric phenomena
image realism
Innovation

Methods, ideas, or system contributions that make the work stand out.

diffusion model
dual-prior control
Digital Elevation Model (DEM)
cloud-fog modeling
controllable image generation
Zuopeng Zhao
School of Computer Science and Technology/School of Artificial Intelligence, China University of Mining and Technology, Xuzhou 221116, China
Ying Liu
School of Computer Science and Technology/School of Artificial Intelligence, China University of Mining and Technology, Xuzhou
Kanyaphakphachsorn Pharksuwan
School of Computer Science and Technology/School of Artificial Intelligence, China University of Mining and Technology, Xuzhou
Su Luo
School of Computer Science and Technology/School of Artificial Intelligence, China University of Mining and Technology, Xuzhou
Xiaoyu Li
Hong Kong University of Science and Technology
Deep Learning, Computer Graphics, Computational Photography
Maocai Ning
School of Computer Science and Technology/School of Artificial Intelligence, China University of Mining and Technology, Xuzhou