🤖 AI Summary
In humanitarian assistance and disaster response (HADR), infrastructure damage assessment for buildings and roads faces three key challenges: severe class imbalance, scarcity of moderately damaged samples, and substantial noise in pixel-level manual annotations. To address these, this work integrates vision-language models (VLMs) into disaster damage data generation. By jointly leveraging remote sensing imagery and human semantic priors, the approach enables semantic-guided, fine-grained, and diverse synthetic damage image generation, mitigating annotation noise and augmenting hard-to-classify samples. Initial experiments suggest encouraging generation quality and improved classification of scenes with different levels of structural damage across multiple infrastructure types, including buildings and roads.
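The paper does not include an implementation, but one plausible realization of semantic-guided generation is text-conditioned image-to-image diffusion: a pre-disaster satellite tile plus a damage-level prompt yields a synthetic damaged variant. The sketch below assumes the Hugging Face `diffusers` library and a Stable Diffusion checkpoint; the model choice, prompts, and `strength` value are illustrative assumptions, not the authors' pipeline.

```python
# Hypothetical sketch: prompt-conditioned synthetic damage generation.
# Assumes the Hugging Face `diffusers` library; the model, prompts, and
# strength value are illustrative, not the paper's actual pipeline.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Damage-level prompts encode the human semantic prior; varying them
# yields fine-grained, diverse synthetic samples for rare classes.
DAMAGE_PROMPTS = {
    "minor":    "satellite view of a building with minor roof damage",
    "moderate": "satellite view of a partially collapsed building",
    "severe":   "satellite view of a destroyed building, rubble and debris",
}

def synthesize(pre_disaster_tile: Image.Image, level: str) -> Image.Image:
    """Generate a synthetic post-disaster tile at the requested damage level."""
    return pipe(
        prompt=DAMAGE_PROMPTS[level],
        image=pre_disaster_tile.convert("RGB").resize((512, 512)),
        strength=0.6,          # how far to drift from the pre-disaster tile
        guidance_scale=7.5,    # adherence to the damage-level prompt
    ).images[0]

tile = Image.open("pre_disaster_tile.png")
synthetic = synthesize(tile, "moderate")
synthetic.save("synthetic_moderate.png")
```

Varying the prompt and `strength` per sample is one way to obtain the diversity the summary describes, especially for the scarce moderate-damage class.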
📝 Abstract
Prompt and accurate damage assessment is of crucial importance in humanitarian assistance and disaster response (HADR). Current deep learning approaches struggle to generalize in HADR settings due to class imbalance, the scarcity of moderate-damage examples, and inaccurate pixel-level human labeling. To address these limitations, state-of-the-art vision-language models (VLMs) that fuse imagery with human semantic knowledge offer an opportunity to generate a diverse set of image-based damage data effectively. Our initial experimental results suggest encouraging data generation quality, which yields an improvement in classifying scenes with different levels of structural damage to buildings, roads, and other infrastructure.
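As an illustration of how such synthetic data might counter the class imbalance the abstract describes, the sketch below tops up underrepresented damage classes to a common target count before training. The class names, counts, and `synthesize` helper are hypothetical, not part of the paper.

```python
# Hypothetical sketch: balancing a damage-classification training set
# with synthetic samples. `synthesize` stands in for a VLM-guided
# generator (see the sketch above); class names and counts are assumed.
import random
from collections import Counter

def balance_with_synthetic(samples, synthesize, target_per_class=None):
    """Augment minority classes with synthetic (image, label) pairs.

    samples: list of (image_path, damage_level) pairs from real annotations.
    synthesize: callable(level) -> synthetic image for that damage level.
    """
    counts = Counter(label for _, label in samples)
    target = target_per_class or max(counts.values())

    augmented = list(samples)
    for level, count in counts.items():
        # Only classes below the target receive synthetic samples.
        for _ in range(target - count):
            augmented.append((synthesize(level), level))
    random.shuffle(augmented)
    return augmented

# Example: moderate damage is the scarce class, so it receives the
# most synthetic samples.
real = [("a.png", "minor")] * 500 + [("b.png", "moderate")] * 40 \
     + [("c.png", "severe")] * 300
balanced = balance_with_synthetic(real, synthesize=lambda lvl: f"synthetic_{lvl}.png")
print(Counter(label for _, label in balanced))  # 500 per class
```

Oversampling to a fixed per-class target is only one design choice; weighted sampling or loss reweighting over the mixed real-plus-synthetic set would serve the same goal.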