EarthSynth: Generating Informative Earth Observation with Diffusion Models

📅 2025-05-17
🤖 AI Summary
Remote sensing image interpretation is severely constrained by the scarcity of annotated data. To address this, we propose EarthSynth, a diffusion-based foundation model that enables cross-satellite, multi-class, label-conditioned image synthesis and is, to the best of our knowledge, the first to explore multi-task generation for remote sensing. Methodologically: (1) we establish a multi-task generative paradigm tailored to remote sensing; (2) we introduce counterfactual compositional training to enhance class controllability and synthetic data diversity; and (3) we design the R-Filter, a rule-based sample selection mechanism, to retain high-information synthetic images. EarthSynth is trained on EarthSynth-180K, a large-scale, diverse remote sensing dataset. Evaluation covers scene classification, object detection, and semantic segmentation, with reported improvements in few-shot and zero-shot performance. By providing scalable, label-aware synthetic data, EarthSynth offers a generative data infrastructure for open-world remote sensing interpretation.

📝 Abstract
Remote sensing image (RSI) interpretation is typically limited by the scarcity of labeled data. To tackle this challenge, we propose EarthSynth, a diffusion-based generative foundation model that synthesizes multi-category, cross-satellite labeled Earth observation data for downstream RSI interpretation tasks. To the best of our knowledge, EarthSynth is the first to explore multi-task generation for remote sensing. Trained on the EarthSynth-180K dataset, EarthSynth employs the Counterfactual Composition training strategy to improve training data diversity and enhance category control. Furthermore, we propose R-Filter, a rule-based method that selects the more informative synthetic data for downstream tasks. We evaluate EarthSynth on scene classification, object detection, and semantic segmentation in open-world scenarios, offering a practical solution for advancing RSI interpretation.
Problem

Research questions and friction points this paper is trying to address.

Addresses scarcity of labeled remote sensing images
Proposes diffusion model for multi-category Earth observation synthesis
Enhances downstream tasks via synthetic data filtering
Innovation

Methods, ideas, or system contributions that make the work stand out.

Diffusion-based generative model for Earth observation
Counterfactual Composition training for diversity
Rule-based R-Filter for informative synthetic data