🤖 AI Summary
SAR land-cover classification suffers from heavy reliance on labeled data and poor generalization. To address this, we propose a self-supervised contrastive learning framework for SAR foundation modeling. Our method introduces two key innovations: (1) a dynamic instance contrastive module that models global contextual relationships to enhance semantic consistency; and (2) a contour consistency constraint module that integrates shallow-layer geometric priors to improve structural awareness. The model is pre-trained end-to-end on the large-scale SARSense dataset without human annotations. Extensive experiments demonstrate that our foundation model significantly outperforms state-of-the-art methods on downstream tasks (land-cover mapping, water-body detection, and road extraction), achieving superior cross-task transferability and robustness under low-data and unsupervised settings. This work establishes a general-purpose foundation model paradigm for few-shot and unsupervised SAR interpretation.
📝 Abstract
Although significant advances have been achieved in SAR land-cover classification, recent methods remain predominantly focused on supervised learning, which relies heavily on extensive labeled datasets. This dependency not only limits scalability and generalization but also restricts adaptability to diverse application scenarios. In this paper, a general-purpose foundation model for SAR land-cover classification is developed, serving as a robust cornerstone to accelerate the development and deployment of various downstream models. Specifically, a Dynamic Instance and Contour Consistency Contrastive Learning (DI3CL) pre-training framework is presented, which incorporates a Dynamic Instance (DI) module and a Contour Consistency (CC) module. The DI module enhances global contextual awareness by enforcing local consistency across different views of the same region. The CC module leverages shallow feature maps to guide the model to focus on the geometric contours of SAR land-cover objects, thereby improving structural discrimination. Additionally, to enhance robustness and generalization during pre-training, a large-scale and diverse dataset named SARSense, comprising 460,532 SAR images, is constructed to enable the model to capture comprehensive and representative features. To evaluate the generalization capability of our foundation model, we conducted extensive experiments across a variety of SAR land-cover classification tasks, including SAR land-cover mapping, water body detection, and road extraction. The results consistently demonstrate that the proposed DI3CL outperforms existing methods. Our code and pre-trained weights are publicly available at: https://github.com/SARpre-train/DI3CL.
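The abstract does not spell out the loss used by the DI module, so the following is only a minimal sketch of the general idea it builds on: contrastive learning that pulls together embeddings of the same region under two views and pushes apart embeddings of different regions. It uses the standard InfoNCE objective as a stand-in, not the DI3CL loss itself. The function names `l2_normalize` and `info_nce`, the temperature value, and the toy embeddings are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def info_nce(view_a: Sequence[Sequence[float]],
             view_b: Sequence[Sequence[float]],
             temperature: float = 0.2) -> float:
    """Mean InfoNCE loss: row i of view_a should match row i of view_b.

    Generic contrastive objective used as a stand-in (assumption);
    the temperature of 0.2 is an illustrative choice.
    """
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    total = 0.0
    for i, anchor in enumerate(a):
        # Cosine similarity of this anchor to every embedding in the other view.
        logits = [sum(x * y for x, y in zip(anchor, other)) / temperature
                  for other in b]
        # Numerically stable log-sum-exp over the positive and all negatives.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
        # Cross-entropy with the matching index i as the target class.
        total += log_denominator - logits[i]
    return total / len(a)


# Hypothetical toy embeddings: three regions seen under two augmentations.
aligned_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned_b = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
# Same embeddings with the pairing broken, so positives no longer match.
shuffled_b = [aligned_b[1], aligned_b[2], aligned_b[0]]

loss_aligned = info_nce(aligned_a, aligned_b)
loss_shuffled = info_nce(aligned_a, shuffled_b)
print(f"aligned: {loss_aligned:.4f}  shuffled: {loss_shuffled:.4f}")
```

In this toy setup the loss is lower when each region's two views are correctly paired than when the pairing is shuffled, which is the behavior a view-consistency objective is meant to reward. In practice the embeddings would come from an encoder applied to augmented SAR patches rather than from hand-written vectors.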