AI Summary
This study addresses the challenge of building change detection in optical remote sensing imagery, where variations in illumination, seasonal conditions, and surface materials often hinder accurate identification of subtle structural changes from RGB data alone. To tackle this issue, the authors introduce LSMD, the first large-scale, high-resolution, and precisely co-registered multi-modal bi-temporal benchmark dataset for building change detection. They further propose the Multi-modal Spectral Complementarity Network (MSCNet), which leverages neighborhood context enhancement, cross-modal alignment and interaction, and saliency-aware multisource refinement to fully exploit the heterogeneous complementarity between RGB and near-infrared modalities. Experimental results demonstrate that MSCNet significantly outperforms existing methods on LSMD, achieving superior accuracy and robustness in fine-grained building change detection under complex real-world scenarios.
Abstract
Change detection in optical remote sensing imagery is susceptible to illumination fluctuations, seasonal changes, and variations in surface land-cover materials. Relying solely on RGB imagery often produces pseudo-changes and leads to semantic ambiguity in features. Incorporating near-infrared (NIR) information provides heterogeneous physical cues that complement visible light, enhancing the discriminability of building materials and tiny structures and improving detection accuracy. However, existing multi-modal datasets generally lack high-resolution, accurately registered bi-temporal imagery, and current methods often fail to fully exploit the inherent heterogeneity between these modalities. To address these issues, we introduce the Large-scale Small-change Multi-modal Dataset (LSMD), a bi-temporal RGB-NIR building change detection benchmark targeting small changes in realistic scenarios, which provides a rigorous platform for evaluating multi-modal change detection methods in complex environments. Based on LSMD, we further propose the Multi-modal Spectral Complementarity Network (MSCNet) to achieve effective cross-modal feature fusion. MSCNet comprises three key components: the Neighborhood Context Enhancement Module (NCEM), which strengthens local spatial details; the Cross-modal Alignment and Interaction Module (CAIM), which enables deep interaction between RGB and NIR features; and the Saliency-aware Multisource Refinement Module (SMRM), which progressively refines the fused features. Extensive experiments demonstrate that MSCNet effectively leverages multi-modal information and consistently outperforms existing methods under multiple input configurations, validating its efficacy for fine-grained building change detection. The source code will be made publicly available at: https://github.com/AeroVILab-AHU/LSMD
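The abstract does not specify how CAIM's RGB-NIR interaction is computed; a common choice for this kind of cross-modal fusion is scaled dot-product cross-attention, where tokens from one modality query the other and the result is added back residually. The sketch below is a minimal, hypothetical NumPy illustration of that idea, not the paper's actual implementation; the function name `cross_modal_attention` and all tensor sizes are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(rgb_feat, nir_feat):
    """Hypothetical cross-attention: RGB tokens query NIR tokens.

    rgb_feat, nir_feat: (N, C) arrays of N spatial tokens with C channels
    (a feature map flattened over its spatial dimensions).
    Returns RGB features enriched with NIR context via a residual add.
    """
    c = rgb_feat.shape[1]
    # Attention weights between every RGB token and every NIR token.
    attn = softmax(rgb_feat @ nir_feat.T / np.sqrt(c), axis=-1)  # (N, N)
    # Aggregate NIR features per RGB token and fuse residually.
    return rgb_feat + attn @ nir_feat

rng = np.random.default_rng(0)
rgb = rng.standard_normal((16, 8))  # 16 tokens, 8 channels (toy sizes)
nir = rng.standard_normal((16, 8))
fused = cross_modal_attention(rgb, nir)
print(fused.shape)  # (16, 8)
```

In a real network the queries, keys, and values would pass through learned projections and the fusion would typically run in both directions (RGB→NIR and NIR→RGB); this toy version only shows the core attention-and-residual pattern.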