Data Augmentation and Resolution Enhancement using GANs and Diffusion Models for Tree Segmentation

📅 2025-05-21
🤖 AI Summary
To address low tree crown segmentation accuracy and scarce annotated data in urban remote sensing with low-resolution imagery, this paper proposes a domain-adaptive data augmentation framework that integrates generative adversarial networks (GANs) and diffusion models. The framework combines Real-ESRGAN, which preserves structural fidelity during super-resolution, with Latent Diffusion and Stable Diffusion, which generate semantically consistent synthetic training samples, to form an end-to-end, weakly supervised augmentation pipeline that reduces reliance on manual annotations. Coupled with a U-Net-based segmentation architecture and domain-adaptive training, the method improves Intersection-over-Union (IoU) for tree crown segmentation on low-resolution images by over 50%. This improves generalization and robustness across sensor platforms and flight altitudes and offers a cost-effective approach to urban forest monitoring.

📝 Abstract
Urban forests play a key role in enhancing environmental quality and supporting biodiversity in cities. Mapping and monitoring these green spaces are crucial for urban planning and conservation, yet accurately detecting trees is challenging due to complex landscapes and the variability in image resolution caused by different satellite sensors or UAV flight altitudes. While deep learning architectures have shown promise in addressing these challenges, their effectiveness remains strongly dependent on the availability of large and manually labeled datasets, which are often expensive and difficult to obtain in sufficient quantity. In this work, we propose a novel pipeline that integrates domain adaptation with GANs and Diffusion models to enhance the quality of low-resolution aerial images. Our proposed pipeline enhances low-resolution imagery while preserving semantic content, enabling effective tree segmentation without requiring large volumes of manually annotated data. Leveraging models such as pix2pix, Real-ESRGAN, Latent Diffusion, and Stable Diffusion, we generate realistic and structurally consistent synthetic samples that expand the training dataset and unify scale across domains. This approach not only improves the robustness of segmentation models across different acquisition conditions but also provides a scalable and replicable solution for remote sensing scenarios with scarce annotation resources. Experimental results demonstrated an improvement of over 50% in IoU for low-resolution images, highlighting the effectiveness of our method compared to traditional pipelines.
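The headline result is reported in IoU, so a small worked example may help when reading it. The sketch below is not taken from the paper: the function names `iou` and `relative_gain` and the toy masks are illustrative assumptions, and the abstract does not state whether "over 50%" is a relative gain or a gain in absolute IoU points, so both are computed.

```python
def iou(pred, target):
    """Intersection-over-Union for two binary masks given as nested lists."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Convention assumed here: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def relative_gain(baseline, improved):
    """Relative improvement of `improved` over `baseline` (0.5 means +50%)."""
    return (improved - baseline) / baseline


# Toy 4x4 masks (hypothetical data, not results from the paper).
target = [[1, 1, 0, 0],
          [1, 1, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0]]
baseline_pred = [[1, 0, 0, 0],
                 [1, 0, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]]
improved_pred = [[1, 1, 0, 0],
                 [1, 0, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]]

base_iou = iou(baseline_pred, target)   # 2 shared pixels / 4 in the union
new_iou = iou(improved_pred, target)    # 3 shared pixels / 4 in the union
print("absolute gain:", new_iou - base_iou)
print("relative gain:", relative_gain(base_iou, new_iou))
```

In this toy case the same change reads as a gain of 0.25 IoU points or as a 50% relative gain, which is why the distinction matters when comparing against other pipelines.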
Problem

Research questions and friction points this paper is trying to address.

Enhancing low-resolution aerial images for tree segmentation
Reducing dependency on large manually labeled datasets
Improving segmentation robustness across varying acquisition conditions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Enhance low-resolution images using GANs and Diffusion
Generate synthetic samples to expand training dataset
Improve tree segmentation without requiring large volumes of manually annotated data
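One way to picture the scale-unification idea in the points above is a step that upsamples a low-resolution tile together with its label mask so the pair stays pixel-aligned. The sketch below is an assumption-laden illustration, not the authors' pipeline: `upscale_nearest` and `make_training_pair` are hypothetical helpers, and plain nearest-neighbor upsampling stands in for a learned enhancer such as Real-ESRGAN, which would be passed in through the `enhance` argument.

```python
def upscale_nearest(grid, factor):
    """Nearest-neighbor upsampling of a 2D nested list by an integer factor."""
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(factor)]
        # Repeat each widened row `factor` times, copying so rows are independent.
        out.extend(list(wide) for _ in range(factor))
    return out


def make_training_pair(image, mask, factor, enhance=upscale_nearest):
    """Return an (image, mask) pair at the target scale.

    `enhance` is a placeholder for a super-resolution model (assumption);
    the mask always uses nearest-neighbor so labels remain binary.
    """
    hi_image = enhance(image, factor)
    hi_mask = upscale_nearest(mask, factor)
    if len(hi_image) != len(hi_mask) or len(hi_image[0]) != len(hi_mask[0]):
        raise ValueError("enhanced image and mask must have the same shape")
    return hi_image, hi_mask


# Toy 2x2 tile and mask (hypothetical values).
low_res_image = [[10, 20],
                 [30, 40]]
low_res_mask = [[1, 0],
                [0, 1]]

hi_image, hi_mask = make_training_pair(low_res_image, low_res_mask, factor=2)
for row in hi_mask:
    print(row)
```

Keeping the mask on nearest-neighbor interpolation avoids fractional label values, while the image path is free to use any enhancer that returns the same output shape.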
Alessandro dos Santos Ferreira
Federal University of Mato Grosso do Sul, Campo Grande, MS, Brazil
Ana Paula Marques Ramos
São Paulo State University (Unesp)
Remote Sensing of Vegetation, Machine learning (shallow and deep learning), Spatial analyst
José Marcato Junior
Federal University of Mato Grosso do Sul, Campo Grande, MS, Brazil
Wesley Nunes Gonçalves
Federal University of Mato Grosso do Sul
Computer Vision, Pattern Recognition