Foundation Model-Driven Semantic Change Detection in Remote Sensing Imagery

📅 2026-02-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses two limitations of existing remote sensing semantic change detection (SCD) methods: insufficient semantic understanding and overly complex paradigms. It proposes PerASCD, a framework built on the remote sensing foundation model PerA, with a modular Cascaded Gated Decoder (CG-Decoder) that streamlines the decoding pipeline and strengthens multi-level feature interaction. A Soft Semantic Consistency Loss (SSCLoss) mitigates numerical instability during training. The approach simplifies conventional SCD architectures, adapts across multiple vision encoders, and achieves state-of-the-art performance on two public benchmark datasets.

📝 Abstract
Remote sensing (RS) change detection extracts critical information on surface dynamics and is an essential means of understanding changes in the Earth's surface and environment. Among change detection methods, semantic change detection (SCD) more effectively interprets the multi-class information contained in bi-temporal RS imagery, providing semantic-level predictions that support dynamic change monitoring. However, owing to the limited semantic understanding capability of existing models and the inherent complexity of SCD tasks, existing SCD methods face significant challenges in both performance and paradigm complexity. In this paper, we propose PerASCD, an SCD method driven by the RS foundation model PerA and designed to enhance multi-scale semantic understanding and overall performance. We introduce a modular Cascaded Gated Decoder (CG-Decoder) that simplifies complex SCD decoding pipelines while promoting effective multi-level feature interaction and fusion. In addition, we propose a Soft Semantic Consistency Loss (SSCLoss) to mitigate the numerical instability commonly encountered during SCD training. We further explore the applicability of multiple existing RS foundation models to the SCD task when equipped with the proposed decoder. Experimental results demonstrate that our decoder not only simplifies the SCD paradigm but also adapts seamlessly across various vision encoders. Our method achieves state-of-the-art (SOTA) performance on two public benchmark datasets, validating its effectiveness. The code is available at https://github.com/SathShen/PerASCD.git.
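To make "semantic-level predictions" on bi-temporal imagery concrete, the following is a minimal sketch of what an SCD result contains: a class map for each date, a binary change mask, and the "from-to" transitions between classes. It illustrates the output format only and is not the PerASCD method, which predicts these quantities with a learned encoder and decoder. The function name `semantic_change`, the class names, and the toy 2x3 maps are all hypothetical.

```python
from collections import Counter


def semantic_change(labels_t1, labels_t2):
    """Compare two per-pixel class maps (lists of rows) of identical shape.

    Returns a binary change mask (1 = class differs between dates) and a
    Counter of (class_at_t1, class_at_t2) transitions over changed pixels.
    """
    if len(labels_t1) != len(labels_t2):
        raise ValueError("label maps must have the same number of rows")

    mask = []
    transitions = Counter()
    for row1, row2 in zip(labels_t1, labels_t2):
        if len(row1) != len(row2):
            raise ValueError("label maps must have the same number of columns")
        mask_row = []
        for c1, c2 in zip(row1, row2):
            changed = c1 != c2
            mask_row.append(int(changed))
            if changed:
                transitions[(c1, c2)] += 1
        mask.append(mask_row)
    return mask, transitions


# Hypothetical toy label maps for two acquisition dates (assumed classes).
t1 = [["water", "forest", "forest"],
      ["farmland", "farmland", "building"]]
t2 = [["water", "building", "forest"],
      ["farmland", "building", "building"]]

mask, transitions = semantic_change(t1, t2)
print(mask)
print(dict(transitions))
```

Binary change detection would stop at the mask; the transition counts are the additional multi-class information that SCD provides.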
Problem

Research questions and friction points this paper is trying to address.

Semantic Change Detection
Remote Sensing Imagery
Foundation Model
Multi-scale Semantic Understanding
Change Detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic Change Detection
Foundation Model
Cascaded Gated Decoder
Soft Semantic Consistency Loss
Remote Sensing Imagery
Hengtong Shen
School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China
Li Yan
School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China
Hong Xie
University of Science and Technology of China (USTC)
Data Science/Mining, Online Learning
Yaxuan Wei
School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China
Xinhao Li
Nanjing University
Video Understanding, Multimodal LLM, Vision-Language Learning
Wenfei Shen
School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China
Peixian Lv
School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China
Fei Tan
Associate Professor, East China Normal University
NLP, Data Mining, Network Science