🤖 AI Summary
Addressing severe degradation—including geometric distortion, spectral deficiency, and misregistration—as well as the lack of pixel-level annotations in mid-20th-century historical satellite imagery (e.g., Keyhole), this work introduces Urban1960SatBench, the first benchmark dataset for urban remote sensing from the 1960s. We further propose Urban1960SatUSM, an unsupervised semantic segmentation framework tailored to such challenging imagery. Its core innovations comprise: (i) a self-supervised architecture for joint geometric-spectral modeling of historical images; (ii) a confidence-aware spatial alignment mechanism to mitigate registration errors; and (iii) a focal confidence loss function designed to refine noisy pseudo-labels. Evaluated on the Urban1960SatSeg test set, our method achieves over 12% average accuracy improvement over state-of-the-art unsupervised approaches. This work establishes the first reproducible and robust technical foundation for quantitative visual analysis of long-term urban evolution using archival satellite data.
📝 Abstract
Historical satellite imagery, such as mid-20$^{th}$ century Keyhole data, offers rare insights into early urban development and long-term transformation. However, severe quality degradation (e.g., distortion, misalignment, and spectral scarcity) and the absence of annotations have long hindered semantic segmentation of such historical remote sensing (RS) imagery. To bridge this gap and enhance understanding of urban development, we introduce $\textbf{Urban1960SatBench}$, an annotated segmentation dataset based on historical satellite imagery with the earliest observation time among all existing segmentation datasets, along with $\textbf{Urban1960SatUSM}$, a benchmark framework for unsupervised segmentation tasks. First, $\textbf{Urban1960SatBench}$ serves as a novel, expertly annotated semantic segmentation dataset built on mid-20$^{th}$ century Keyhole imagery, covering 1,240 km$^2$ and key urban classes (buildings, roads, farmland, water). As the earliest segmentation dataset of its kind, it provides a pioneering benchmark for historical urban understanding. Second, $\textbf{Urban1960SatUSM}$ (Unsupervised Segmentation Model) is a novel unsupervised semantic segmentation framework for historical RS imagery. Built on a self-supervised learning architecture, it employs a confidence-aware alignment mechanism and a focal-confidence loss, which generate robust pseudo-labels and adaptively prioritize prediction difficulty and label reliability, improving unsupervised segmentation on noisy historical data without manual supervision. Experiments show that Urban1960SatUSM significantly outperforms existing unsupervised segmentation methods on Urban1960SatSeg for segmenting historical urban scenes, a promising step toward quantitative studies of long-term urban change using modern computer vision. Our benchmark and supplementary material are available at https://github.com/Tianxiang-Hao/Urban1960SatSeg.
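The abstract does not give the form of the focal-confidence loss, so the following is only a minimal sketch of one way a loss could weigh both prediction difficulty and pseudo-label reliability. It assumes a standard focal modulation term $(1 - p_t)^\gamma$ multiplied by a per-pixel confidence weight. The function name, the `gamma` default, and the confidence-sum normalization are hypothetical and are not taken from the paper.

```python
import numpy as np

def focal_confidence_loss(logits, pseudo_labels, confidence, gamma=2.0):
    """Illustrative focal-style loss weighted by pseudo-label confidence.

    Assumed shapes (hypothetical, not from the paper):
      logits:        (N, C) raw class scores per pixel
      pseudo_labels: (N,)   integer pseudo-label per pixel
      confidence:    (N,)   reliability of each pseudo-label in [0, 1]
    """
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Log-probability assigned to each pixel's pseudo-label.
    log_pt = log_probs[np.arange(len(pseudo_labels)), pseudo_labels]
    pt = np.exp(log_pt)

    # Focal term up-weights hard pixels (low pt); the confidence term
    # down-weights pixels whose pseudo-label is considered unreliable.
    per_pixel = -confidence * (1.0 - pt) ** gamma * log_pt

    # Normalize by total confidence so the scale does not depend on
    # how many pixels are trusted.
    return per_pixel.sum() / max(confidence.sum(), 1e-8)

# Two pixels over the four classes named in the abstract
# (buildings, roads, farmland, water); the second pseudo-label is
# given zero confidence, so only the first pixel contributes.
logits = np.array([[2.0, 0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 0.0]])
pseudo_labels = np.array([0, 1])
confidence = np.array([1.0, 0.0])
loss = focal_confidence_loss(logits, pseudo_labels, confidence)
```

With `gamma=0` this sketch reduces to confidence-weighted cross-entropy, which makes the separate contributions of the focal and confidence terms easy to inspect.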