Mitigating Long-Tail Bias via Prompt-Controlled Diffusion Augmentation

📅 2026-02-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the severe long-tailed pixel imbalance in high-resolution remote sensing image semantic segmentation, further exacerbated by inconsistent class distributions between the Urban and Rural domains in the LoveDA dataset, which induces significant model bias. To tackle this challenge, the authors propose a prompt-controlled diffusion augmentation framework that first generates semantic layouts adhering to specified class proportions and domain attributes via a domain-aware masked-ratio conditioned discrete diffusion model, and then synthesizes photorealistic images using a ControlNet-guided Stable Diffusion pipeline. This approach enables, for the first time, explicit joint control over both class distribution and domain characteristics in synthetic samples. Extensive experiments demonstrate consistent and substantial improvements in minority-class performance and cross-domain generalization across multiple segmentation backbones, with particularly notable gains in Urban-to-Rural and Rural-to-Urban transfer scenarios.

📝 Abstract
Semantic segmentation of high-resolution remote-sensing imagery is critical for urban mapping and land-cover monitoring, yet training data typically exhibits severe long-tailed pixel imbalance. In the LoveDA dataset, this challenge is compounded by an explicit Urban/Rural split with distinct appearance and inconsistent class-frequency statistics across domains. We present a prompt-controlled diffusion augmentation framework that synthesizes paired label-image samples with explicit control of both domain and semantic composition. Stage A uses a domain-aware, masked-ratio-conditioned discrete diffusion model to generate layouts that satisfy user-specified class-ratio targets while respecting learned co-occurrence structure. Stage B translates layouts into photorealistic, domain-consistent images using Stable Diffusion with ControlNet guidance. Mixing the resulting ratio- and domain-controlled synthetic pairs with real data yields consistent improvements across multiple segmentation backbones, with gains concentrated on minority classes and improved Urban and Rural generalization, demonstrating controllable augmentation as a practical mechanism to mitigate long-tail bias in remote-sensing segmentation. Source code, pretrained models, and synthetic datasets are available at https://github.com/Buddhi19/SyntheticGen.git.
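The abstract's Stage A conditions layout generation on user-specified class-ratio targets derived from a long-tailed pixel distribution. A minimal sketch of that conditioning input, assuming LoveDA's 7-class scheme and a hypothetical temperature-based rebalancing rule (the paper does not specify how targets are chosen):

```python
import numpy as np

# Hypothetical illustration: measure per-class pixel frequencies over a batch of
# label maps, then build a flattened class-ratio target that up-weights minority
# classes. Class names follow LoveDA's 7-class scheme; the rebalancing rule is
# an assumption for illustration, not the authors' method.
CLASSES = ["background", "building", "road", "water", "barren", "forest", "agriculture"]

def pixel_frequencies(label_maps):
    """Fraction of all pixels belonging to each class across the label maps."""
    counts = np.zeros(len(CLASSES), dtype=np.int64)
    for lm in label_maps:
        counts += np.bincount(lm.ravel(), minlength=len(CLASSES))
    return counts / counts.sum()

def rebalanced_target(freqs, temperature=0.5):
    """Soften a long-tailed distribution: freqs**T renormalized (T < 1 flattens it)."""
    t = freqs ** temperature
    return t / t.sum()

# Toy long-tailed labels: mostly background, very little of the tail classes.
rng = np.random.default_rng(0)
labels = [rng.choice(len(CLASSES), size=(64, 64),
                     p=[.55, .20, .10, .06, .04, .03, .02])
          for _ in range(8)]
freqs = pixel_frequencies(labels)
target = rebalanced_target(freqs)
print("observed :", np.round(freqs, 3))
print("target   :", np.round(target, 3))
```

The resulting `target` vector is what a ratio-conditioned layout model would take as its conditioning signal; lowering `temperature` flattens the distribution further toward uniform.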
Problem

Research questions and friction points this paper is trying to address.

long-tail bias
semantic segmentation
remote-sensing imagery
class imbalance
domain shift
Innovation

Methods, ideas, or system contributions that make the work stand out.

prompt-controlled diffusion
long-tail bias mitigation
semantic segmentation
domain-aware augmentation
ControlNet
Buddhi Wijenayake
Student at University of Peradeniya
Computer Vision, Statistics
Nichula Wasalathilake
University of Peradeniya, Peradeniya, Sri Lanka
R. Godaliyadda
University of Peradeniya, Peradeniya, Sri Lanka
Vijitha R. Herath
University of Peradeniya, Peradeniya, Sri Lanka
Parakrama B. Ekanayake
University of Peradeniya, Peradeniya, Sri Lanka
Vishal M. Patel
Associate Professor, ECE, Johns Hopkins University
Image Processing, Computer Vision, Biometrics, Medical Image Analysis, Remote Sensing