Effective Damage Data Generation by Fusing Imagery with Human Knowledge Using Vision-Language Models

📅 2025-08-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
In humanitarian assistance and disaster response (HADR), infrastructure damage assessment—particularly for buildings and roads—faces three key challenges: severe class imbalance, scarcity of moderately damaged samples, and substantial noise in pixel-level manual annotations. To address these, this work pioneers the integration of vision-language models (VLMs) into disaster damage data generation. By jointly leveraging remote sensing imagery and human semantic priors, our approach enables semantic-guided, fine-grained, and diverse synthetic damage image generation—effectively mitigating annotation noise and augmenting hard-to-classify samples. Extensive experiments demonstrate that the synthesized data significantly enhances the generalization capability of deep learning models on multi-level damage classification tasks. Notably, our method achieves state-of-the-art (SOTA) performance in fine-grained identification across multiple infrastructure types, including buildings and roads.

📝 Abstract
Prompt and accurate damage assessment is of crucial importance in humanitarian assistance and disaster response (HADR). Current deep learning approaches struggle to generalize effectively due to class imbalance in the data, the scarcity of moderate-damage examples, and human inaccuracy in pixel-level labeling during HADR situations. To address these limitations, state-of-the-art vision-language models (VLMs) can be exploited to fuse imagery with human knowledge, creating an opportunity to generate a diversified set of image-based damage data effectively. Our initial experimental results suggest encouraging data generation quality, demonstrating improved classification of scenes with different levels of structural damage to buildings, roads, and infrastructure.
Problem

Research questions and friction points this paper is trying to address.

Assess damages accurately in HADR scenarios
Address data imbalance and labeling inaccuracies in damage assessment
Generate diverse damage data using vision-language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fuses imagery with human knowledge using VLMs
Generates diversified image-based damage data
Improves classification of structural damage levels
Jie Wei
City College of New York, New York, NY, USA
Erika Ardiles-Cruz
Air Force Research Lab, Rome, NY, USA
Aleksey Panasyuk
Air Force Research Lab, Rome, NY, USA
Erik Blasch
Air Force Research Laboratory
Information Fusion · Target Tracking · Image Fusion · Avionics · Human Factors