AI Summary
Existing remote sensing methods suffer from insufficient exploitation of temporal details in time-series modeling and struggle to jointly optimize change detection and land-cover classification. To address these limitations, this paper proposes the first spatiotemporal cooperative continuous change detection framework designed for ≥3-phase satellite imagery. The method introduces two novel components: (1) a self-attention-driven Temporal Feature Refinement (TFR) module to enhance fine-grained temporal dynamics, and (2) a Markov Network-based Multi-Task Integration (MTI) module that unifies building-level pairwise change identification and semantic segmentation, producing temporally consistent change maps. The architecture integrates Convolutional Neural Networks (ConvNets), self-attention mechanisms, Markov Random Fields (MRFs), and multi-task learning, and is specifically tailored for high-resolution optical imagery from PlanetScope and Gaofen-2. Experimental results demonstrate state-of-the-art performance, achieving F1-scores of 0.551 on PlanetScope and 0.440 on Gaofen-2, significantly outperforming prevailing bi- and multi-temporal approaches.
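To make the idea of self-attention along the time axis concrete, the following is a minimal NumPy sketch, not the paper's TFR module. It assumes ConvNet features of shape `(T, H, W, C)` and treats every pixel as an independent sequence of `T` tokens. The function name `temporal_self_attention`, the single attention head, the residual connection, and the random projection matrices are all illustrative assumptions.

```python
import numpy as np


def temporal_self_attention(feats, w_q, w_k, w_v):
    """Refine per-pixel features by attending across the T acquisition dates.

    feats: array of shape (T, H, W, C), e.g. ConvNet feature maps per date.
    w_q, w_k, w_v: (C, C) projection matrices (hypothetical, untrained here).
    Returns (refined features with shape (T, H, W, C), attention weights).
    """
    t, h, w, c = feats.shape
    # One sequence of T tokens per pixel: (H*W, T, C).
    seq = feats.transpose(1, 2, 0, 3).reshape(h * w, t, c)
    q, k, v = seq @ w_q, seq @ w_k, seq @ w_v
    # Scaled dot-product scores between all pairs of dates: (H*W, T, T).
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(c)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Residual connection (an assumption) keeps the original ConvNet features.
    refined = seq + weights @ v
    return refined.reshape(h, w, t, c).transpose(2, 0, 1, 3), weights


rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 2, 3, 8))          # T=4 dates, 2x3 pixels, C=8
w_q, w_k, w_v = (rng.normal(size=(8, 8)) * 0.1 for _ in range(3))
refined, weights = temporal_self_attention(feats, w_q, w_k, w_v)
```

Each row of `weights` sums to one, so every refined feature is the original feature plus a convex combination of value vectors from all dates of the same pixel.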
Abstract
Urbanization advances at unprecedented rates, resulting in negative effects on the environment and human well-being. Remote sensing has the potential to mitigate these effects by supporting sustainable development strategies with accurate information on urban growth. Deep learning-based methods have achieved promising urban change detection results from optical satellite image pairs using convolutional neural networks (ConvNets), transformers, and a multi-task learning setup. However, transformers have not been leveraged for urban change detection with multi-temporal data, i.e., more than two images, and multi-task learning methods lack integration approaches that combine change and segmentation outputs. To fill this research gap, we propose a continuous urban change detection method that identifies changes in each consecutive image pair of a satellite image time series (SITS). Specifically, we propose a temporal feature refinement (TFR) module that utilizes self-attention to improve ConvNet-based multi-temporal building representations. Furthermore, we propose a multi-task integration (MTI) module that utilizes Markov networks to find an optimal building map time series based on segmentation and dense change outputs. The proposed method effectively identifies urban changes based on high-resolution SITS acquired by the PlanetScope constellation (F1 score 0.551) and Gaofen-2 (F1 score 0.440). Moreover, our experiments on two challenging datasets demonstrate the effectiveness of the proposed method compared to bi-temporal and multi-temporal urban change detection and segmentation methods.
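The integration idea can be illustrated with a simplified per-pixel example. The sketch below is an assumption-laden stand-in for the MTI module, not the authors' formulation: it models one pixel as a chain-structured Markov network whose unary terms come from segmentation probabilities and whose pairwise terms come from change probabilities between consecutive dates, then finds the most probable building/no-building sequence with the Viterbi algorithm. The function name `map_building_sequence`, the chain topology, and the log-probability potentials are all hypothetical choices.

```python
import numpy as np


def map_building_sequence(p_build, p_change, eps=1e-6):
    """Most probable 0/1 building sequence for one pixel (Viterbi on a chain).

    p_build:  length-T building probabilities from the segmentation output.
    p_change: length-(T-1) change probabilities for consecutive date pairs.
    """
    p_build = np.clip(np.asarray(p_build, dtype=float), eps, 1 - eps)
    p_change = np.clip(np.asarray(p_change, dtype=float), eps, 1 - eps)
    n = len(p_build)
    # Unary log-potentials: column 0 = no building, column 1 = building.
    unary = np.log(np.stack([1 - p_build, p_build], axis=1))
    score = unary[0].copy()
    back = np.zeros((n, 2), dtype=int)
    for t in range(1, n):
        same = np.log(1 - p_change[t - 1])   # labels agree -> "no change"
        diff = np.log(p_change[t - 1])       # labels differ -> "change"
        pair = np.array([[same, diff], [diff, same]])  # pair[prev, cur]
        cand = score[:, None] + pair
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + unary[t]
    # Backtrack from the best final label.
    labels = np.zeros(n, dtype=int)
    labels[-1] = score.argmax()
    for t in range(n - 1, 0, -1):
        labels[t - 1] = back[t, labels[t]]
    return labels


# Noisy segmentation (0.6 at the second date) but only one likely change.
p_build = [0.1, 0.6, 0.2, 0.9, 0.95]
p_change = [0.05, 0.05, 0.8, 0.05]
labels = map_building_sequence(p_build, p_change)
changes = np.diff(labels) != 0
print(labels.tolist())   # [0, 0, 0, 1, 1]
print(changes.tolist())  # [False, False, True, False]
```

Thresholding the segmentation probabilities alone at 0.5 would give `[0, 1, 0, 1, 1]`, implying three changes. Combining them with the change probabilities yields a single construction event between the third and fourth dates, which is the kind of temporal consistency the abstract describes.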