Is Self-Supervised Pre-training on Satellite Imagery Better than ImageNet? A Systematic Study with Sentinel-2

📅 2025-02-15

🤖 AI Summary
Problem: It remains unclear whether domain-specific self-supervised learning (SSL) pre-training substantially outperforms ImageNet pre-training in remote sensing. Method: We introduce GeoNet, a large-scale Sentinel-2–based remote sensing dataset, and conduct the first systematic, apples-to-apples comparison of SwAV and MAE under a unified evaluation framework, assessing their transferability across six few-shot downstream tasks in both remote sensing and natural imagery domains. Contribution/Results: Domain-aligned SSL pre-training yields only marginal gains over ImageNet initialization, and performance is largely comparable across most tasks. The substantial data curation and computational overhead of domain-specific SSL may therefore not be justified by the measured improvements. This work challenges the prevailing assumption that domain alignment inherently leads to superior performance, provides an empirical benchmark, and prompts a cost-benefit reassessment of pre-training strategies for remote sensing models.

📝 Abstract
Self-supervised learning (SSL) has demonstrated significant potential in pre-training robust models with limited labeled data, making it particularly valuable for remote sensing (RS) tasks. A common assumption is that pre-training on domain-aligned data provides maximal benefits on downstream tasks, particularly when compared to ImageNet pre-training (INP). In this work, we investigate this assumption by collecting GeoNet, a large and diverse dataset of global optical Sentinel-2 imagery, and pre-training SwAV and MAE on both GeoNet and ImageNet. Evaluating these models on six downstream tasks in the few-shot setting reveals that SSL pre-training on RS data offers only modest performance improvements over INP, and that INP remains competitive in multiple scenarios. This indicates that the presumed benefits of SSL pre-training on RS data may be overstated, and the additional costs of data curation and pre-training could be unjustified.
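
The few-shot comparison described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the abstract does not say how the labeled subsets are drawn or which classifier is used downstream, so this example draws k labeled examples per class and uses a nearest-centroid probe on precomputed feature vectors as a stand-in. The function names (`sample_few_shot`, `fit_centroids`, `predict`, `few_shot_accuracy`) are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def sample_few_shot(labels, k, seed=0):
    """Return sorted indices containing exactly k examples of each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    chosen = []
    for label in sorted(by_class):
        if len(by_class[label]) < k:
            raise ValueError(f"class {label!r} has fewer than {k} examples")
        chosen.extend(rng.sample(by_class[label], k))
    return sorted(chosen)


def fit_centroids(features, labels):
    """Average the feature vectors of each class (nearest-centroid probe)."""
    sums, counts = {}, defaultdict(int)
    for vec, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def predict(centroids, vec):
    """Assign vec to the class with the closest centroid (squared Euclidean)."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(center, vec))
    return min(centroids, key=lambda lab: sq_dist(centroids[lab]))


def few_shot_accuracy(features, labels, k, seed=0):
    """Fit on k labeled examples per class, score on all remaining examples.

    `features` stands in for the outputs of a frozen pre-trained encoder
    (an assumption; the paper's downstream protocol may differ).
    """
    train_idx = sample_few_shot(labels, k, seed)
    train_set = set(train_idx)
    test_idx = [i for i in range(len(labels)) if i not in train_set]
    if not test_idx:
        raise ValueError("no examples left for evaluation")
    centroids = fit_centroids(
        [features[i] for i in train_idx], [labels[i] for i in train_idx]
    )
    correct = sum(predict(centroids, features[i]) == labels[i] for i in test_idx)
    return correct / len(test_idx)
```

To compare two initializations under this sketch, one would extract features for the same downstream images from each encoder, call `few_shot_accuracy` on both with the same `k` and `seed` so that the labeled subset is identical, and average the results over several seeds before drawing conclusions.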

Problem

Research questions and friction points this paper is trying to address.

Whether self-supervised pre-training on satellite imagery is worth its cost
How domain-aligned pre-training compares with ImageNet pre-training (INP)
Few-shot performance on downstream remote sensing tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

GeoNet, a large and diverse dataset of global optical Sentinel-2 imagery
SwAV and MAE pre-trained on both GeoNet and ImageNet
Few-shot evaluation on six downstream tasks