🤖 AI Summary
To address the high intra-class similarity, ambiguous boundaries, and prohibitive annotation costs of building damage assessment in conflict zones, this paper introduces Change Semantic Detection (CSD), a new task that requires only bitemporal remote sensing imagery and semantic labels for the changed regions. This removes the need for full-scene pixel-level annotations and substantially reduces labeling overhead. Methodologically, we propose a Multi-scale Cross-attention Difference Siamese Network (MC-DiSNet), which integrates DINOv3 pre-trained features with self-supervised fine-grained change modeling. Evaluated on the Gaza-Change and SECOND datasets, our approach achieves significant accuracy improvements, particularly for small-area and weak-boundary damage instances, demonstrating robustness under severe visual ambiguity. This work establishes a practical, scalable paradigm for rapid and precise post-conflict damage assessment.
📝 Abstract
Accurately and swiftly assessing damage from conflicts is crucial for humanitarian aid and regional stability. In conflict zones, damaged areas often share similar architectural styles, and the damage typically covers small areas with blurred boundaries. These characteristics lead to limited data, annotation difficulties, and significant recognition challenges, including high intra-class similarity and ambiguous semantic changes. To address these issues, we introduce a pre-trained DINOv3 model and propose a multi-scale cross-attention difference Siamese network (MC-DiSNet). The powerful visual representation capability of the DINOv3 backbone enables robust and rich feature extraction from bi-temporal remote sensing images. We also release a new Gaza-Change dataset containing high-resolution satellite image pairs from 2023-2024 with pixel-level semantic change annotations. It is worth emphasizing that our annotations include only the semantic pixels of changed areas. Unlike conventional semantic change detection (SCD), our approach eliminates the need for large-scale semantic annotations of bi-temporal images and instead focuses directly on the changed regions. We term this new task change semantic detection (CSD). The CSD task is a direct extension of binary change detection (BCD). Because the semantic regions have limited spatial extent, CSD is more challenging than traditional SCD tasks. We evaluate our method under the CSD framework on both the Gaza-Change and SECOND datasets. Experimental results demonstrate that our proposed approach effectively addresses the CSD task, and its outstanding performance paves the way for practical applications in rapid damage assessment across conflict zones.
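To make the relationship between CSD and BCD concrete, the following is a minimal sketch of how a CSD label map might be represented and collapsed to a binary change mask. The class ids, the `UNCHANGED` constant, and the helper names `csd_to_bcd` and `changed_fraction` are assumptions made for illustration; the abstract does not specify the label encoding used in the Gaza-Change dataset.

```python
from typing import List

# Hypothetical encoding, for illustration only:
# 0 marks an unchanged pixel, and any positive id is the semantic
# class of a changed pixel. Unchanged pixels carry no semantic label.
UNCHANGED = 0

LabelMap = List[List[int]]


def csd_to_bcd(csd_label: LabelMap) -> LabelMap:
    """Collapse a CSD label map to a binary change mask (1 = changed)."""
    return [[int(v != UNCHANGED) for v in row] for row in csd_label]


def changed_fraction(csd_label: LabelMap) -> float:
    """Fraction of pixels that carry a semantic change label."""
    total = sum(len(row) for row in csd_label)
    changed = sum(v != UNCHANGED for row in csd_label for v in row)
    return changed / total


# A toy 4x4 label map: only three pixels are annotated as changed.
csd_label = [
    [0, 0, 0, 0],
    [0, 2, 2, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]

bcd_mask = csd_to_bcd(csd_label)
fraction = changed_fraction(csd_label)
```

Under this assumed encoding, the binary mask is recoverable from the CSD label, which is one way to read the statement that CSD extends BCD. The small changed fraction in the toy map also illustrates why sparse semantic regions can make the task harder than full-scene SCD.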