Generating Satellite Imagery Data for Wildfire Detection through Mask-Conditioned Generative AI

📅 2026-04-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the scarcity of labeled satellite imagery for wildfire monitoring by applying EarthSynth, an existing mask-conditioned diffusion foundation model for Earth Observation, to synthesize post-disaster Sentinel-2 RGB images. Leveraging an image-inpainting architecture combined with structured prompting strategies, including hand-crafted prompts and vision-language prompts generated by Qwen2-VL, the method enables controllable generation without task-specific fine-tuning. A region-aware color-matching post-processing step further enhances photorealism. Experimental results demonstrate that the inpainting paradigm significantly outperforms full-image generation across four key metrics: Burn IoU (peaking at 0.456), burn-region color distance, darkness contrast, and spectral plausibility, offering an effective data-augmentation solution for wildfire detection.
📝 Abstract
The scarcity of labeled satellite imagery remains a fundamental bottleneck for deep-learning (DL)-based wildfire monitoring systems. This paper investigates whether a diffusion-based foundation model for Earth Observation (EO), EarthSynth, can synthesize realistic post-wildfire Sentinel-2 RGB imagery conditioned on existing burn masks, without task-specific retraining. Using burn masks derived from the CalFireSeg-50 dataset (Martin et al., 2025), we design and evaluate six controlled experimental configurations that systematically vary: (i) pipeline architecture (mask-only full generation vs. inpainting with pre-fire context), (ii) prompt engineering strategy (three hand-crafted prompts and a VLM-generated prompt via Qwen2-VL), and (iii) a region-wise color-matching post-processing step. Quantitative assessment on 10 stratified test samples uses four complementary metrics: Burn IoU, burn-region color distance (ΔC_burn), Darkness Contrast, and Spectral Plausibility. Results show that inpainting-based pipelines consistently outperform full-tile generation across all metrics, with the structured inpainting prompt achieving the best spatial alignment (Burn IoU = 0.456) and burn saliency (Darkness Contrast = 20.44), while color matching produces the lowest color distance (ΔC_burn = 63.22) at the cost of reduced burn saliency. VLM-assisted inpainting is competitive with hand-crafted prompts. These findings provide a foundation for incorporating generative data augmentation into wildfire detection pipelines. Code and experiments are available at: https://www.kaggle.com/code/valeriamartinh/genai-all-runned
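The paper does not spell out its metric formulas here, but two of them can be sketched under common-sense assumptions: Burn IoU as intersection-over-union between a predicted burn mask and the reference mask, and ΔC_burn as the mean per-pixel Euclidean RGB distance inside the burn region. The threshold-free mask inputs and the distance choice below are illustrative, not the authors' exact definitions:

```python
import numpy as np

def burn_iou(pred_mask: np.ndarray, ref_mask: np.ndarray) -> float:
    """Intersection-over-union between predicted and reference burn masks."""
    pred, ref = pred_mask.astype(bool), ref_mask.astype(bool)
    union = np.logical_or(pred, ref).sum()
    if union == 0:
        return 1.0  # both masks empty: perfect agreement by convention
    return float(np.logical_and(pred, ref).sum() / union)

def color_distance_burn(gen_rgb: np.ndarray, real_rgb: np.ndarray,
                        burn_mask: np.ndarray) -> float:
    """Mean Euclidean RGB distance inside the burn region (assumed definition)."""
    m = burn_mask.astype(bool)
    diff = gen_rgb[m].astype(float) - real_rgb[m].astype(float)
    return float(np.linalg.norm(diff, axis=-1).mean())

# Toy example on a 4x4 tile
ref = np.zeros((4, 4), dtype=bool); ref[:2, :2] = True   # 4 reference pixels
pred = np.zeros((4, 4), dtype=bool); pred[:2, :3] = True  # 6 predicted pixels
print(burn_iou(pred, ref))  # 4 intersecting / 6 union = 0.666...
```

In this toy case the masks overlap on 4 pixels out of a 6-pixel union, so the score is 2/3; the reported best of 0.456 would sit between this and a near-miss.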
Problem

Research questions and friction points this paper is trying to address.

wildfire detection
satellite imagery
data scarcity
labeled data
Earth Observation
Innovation

Methods, ideas, or system contributions that make the work stand out.

mask-conditioned generation
diffusion model
wildfire detection
satellite imagery synthesis
inpainting
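The inpainting idea above keeps known pre-fire pixels fixed while the diffusion model fills only the masked burn region. A minimal numpy sketch of the core compositing step (RePaint-style: after each denoising step, generated content is kept only inside the mask and the known context is re-imposed outside it). The denoiser here is a hypothetical stand-in, not EarthSynth's actual API:

```python
import numpy as np

def inpaint_step_composite(x_t: np.ndarray, known_image: np.ndarray,
                           mask: np.ndarray) -> np.ndarray:
    """Keep generated pixels inside the burn mask, known pre-fire pixels outside.

    mask: 1 where the model generates (burn region), 0 where the
    pre-fire context is kept verbatim.
    """
    m = mask[..., None].astype(float)  # broadcast over RGB channels
    return m * x_t + (1.0 - m) * known_image

def toy_denoiser(x: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Hypothetical stand-in: nudges noise toward a dark 'burn scar' color."""
    target = np.array([40.0, 35.0, 30.0])  # dark brown, illustrative
    return 0.7 * x + 0.3 * target + rng.normal(0, 1, x.shape)

rng = np.random.default_rng(0)
pre_fire = np.full((8, 8, 3), 120.0)           # known pre-fire context
burn = np.zeros((8, 8)); burn[2:6, 2:6] = 1.0  # burn mask
x = rng.normal(128, 30, (8, 8, 3))             # start from noise
for _ in range(10):
    x = toy_denoiser(x, rng)
    x = inpaint_step_composite(x, pre_fire, burn)  # re-impose context each step
# Outside the mask the tile equals the pre-fire context exactly
print(np.allclose(x[0, 0], 120.0))  # True
```

This is why the inpainting pipelines beat full-tile generation on spatial alignment: the unburned context is copied rather than hallucinated, so only the masked burn region has to be synthesized.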
Valeria Martin
Department of Intelligent Systems and Robotics, University of West Florida, Pensacola, FL, USA
K. Brent Venable
Professor of Computer Science, IHMC and UWF
Artificial Intelligence · Preferences · Planning and Scheduling · Computational Social Choice
Derek Morgan
Department of Intelligent Systems and Robotics, University of West Florida, Pensacola, FL, USA