TaCo: Capturing Spatio-Temporal Semantic Consistency in Remote Sensing Change Detection

📅 2025-11-25
🤖 AI Summary
Remote sensing change detection (RSCD) methods trained with mask supervision alone localize changes well in space but place few constraints on the semantic transition between the two acquisition dates, which leaves semantic inconsistencies in their predictions. TaCo addresses this by combining a text-guided transition generator with a joint spatio-temporal semantic constraint, so that the semantic transformation between bi-temporal features is modeled explicitly. The method fuses textual semantics with bi-temporal visual features to build transition features, then applies bi-temporal reconstruction constraints and a transition constraint that sharpens discrimination of changes; the design adds no computational overhead at inference. On six public benchmarks covering binary and semantic change detection, the authors report that TaCo consistently achieves state-of-the-art performance.

📝 Abstract
Remote sensing change detection (RSCD) aims to identify surface changes across bi-temporal satellite images. Most previous methods rely solely on mask supervision, which effectively guides spatial localization but provides limited constraints on the temporal semantic transitions. Consequently, they often produce spatially coherent predictions while still suffering from unresolved semantic inconsistencies. To address this limitation, we propose TaCo, a spatio-temporal semantically consistent network, which enriches the existing mask-supervised framework with a spatio-temporal semantic joint constraint. TaCo conceptualizes change as a semantic transition between bi-temporal states, in which one temporal feature representation can be derived from the other via dedicated transition features. To realize this, we introduce a Text-guided Transition Generator that integrates textual semantics with bi-temporal visual features to construct the cross-temporal transition features. In addition, we propose a spatio-temporal semantic joint constraint consisting of bi-temporal reconstruction constraints and a transition constraint: the former enforce alignment between reconstructed and original features, while the latter enhances discrimination for changes. This design yields substantial performance gains without introducing any additional computational overhead during inference. Extensive experiments on six public datasets, spanning both binary and semantic change detection tasks, demonstrate that TaCo consistently achieves state-of-the-art performance.
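
The joint constraint described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the loss formulas, so the additive transition (f2 ≈ f1 + t), the mean-squared-error reconstruction term, the margin-based transition term, and all function names below are assumptions chosen only to make the idea concrete. The transition features t are taken as given here; in TaCo they would come from the Text-guided Transition Generator.

```python
import math

def mse(a, b):
    """Mean squared error between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def reconstruction_loss(f1, f2, t):
    """Bi-temporal reconstruction: rebuild each state from the other.

    Assumption (not from the paper): the transition is additive,
    so f2 ~ f1 + t and f1 ~ f2 - t.
    """
    f2_hat = [x + d for x, d in zip(f1, t)]
    f1_hat = [x - d for x, d in zip(f2, t)]
    return mse(f2_hat, f2) + mse(f1_hat, f1)

def transition_loss(t, changed, margin=1.0):
    """Hypothetical discrimination term on the transition magnitude.

    Changed pixels are pushed to have ||t|| >= margin,
    unchanged pixels are pushed toward ||t|| = 0.
    """
    norm = math.sqrt(sum(d * d for d in t))
    if changed:
        return max(0.0, margin - norm) ** 2
    return norm ** 2

def joint_constraint(pixels, margin=1.0):
    """Average of both terms over (f1, f2, t, changed) tuples."""
    total = 0.0
    for f1, f2, t, changed in pixels:
        total += reconstruction_loss(f1, f2, t)
        total += transition_loss(t, changed, margin)
    return total / len(pixels)

# A changed pixel whose transition feature explains the change exactly,
# versus the same pixel with an uninformative (zero) transition feature.
consistent = [([0.0, 0.0], [3.0, 4.0], [3.0, 4.0], True)]
inconsistent = [([0.0, 0.0], [3.0, 4.0], [0.0, 0.0], True)]
print(joint_constraint(consistent))    # 0.0
print(joint_constraint(inconsistent))  # 26.0
```

In this sketch the constraint is computed only from training-time quantities (the labels and the transition features), so dropping it at test time leaves the change-detection forward pass untouched, which is consistent with the abstract's claim of no extra inference overhead.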
Problem

Research questions and friction points this paper is trying to address.

Addresses semantic inconsistency in remote sensing change detection
Enhances temporal semantic transitions beyond spatial localization
Improves discrimination for changes across bi-temporal satellite images
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces spatio-temporal semantic joint constraint
Uses text-guided transition generator for features
Enhances discrimination without inference overhead
Han Guo
Beihang University
Chenyang Liu
Beihang University
Haotian Zhang
Beihang University
Bowen Chen
Beihang University
Zhengxia Zou
Beihang University
computer vision, image processing, remote sensing, games
Zhenwei Shi
Beihang University