🤖 AI Summary
This study addresses the need for globally scalable, robust, 10-meter-resolution spatiotemporal embeddings of Earth's surface that integrate optical and radar observations to support climate modeling, carbon accounting, and sustainable land-use monitoring. It proposes TESSERA, a dual-stream Transformer architecture for end-to-end joint modeling of Sentinel-2 multispectral and Sentinel-1 SAR polarization time series, described as the first such approach. Trained with self-supervised learning, the model extracts optical and SAR features independently with two parallel encoders and fuses them via an MLP into unified spatiotemporal embeddings. A precomputed global atlas of these embeddings spanning 2017–2024 is publicly released. Evaluated on five diverse downstream tasks, including land-cover classification, crop-type mapping, and deforestation detection, the model consistently outperforms both task-specific models and state-of-the-art geospatial foundation models, establishing a new benchmark for remote sensing representation learning.
📝 Abstract
Satellite remote sensing (RS) enables a wide array of downstream Earth observation (EO) applications, including climate modeling, carbon accounting, and strategies for conservation and sustainable land use. We present TESSERA, a novel Remote Sensing Foundation Model (RSFM) that uses Self-Supervised Learning (SSL) to generate global, robust representations at 10 m scale from pixel-level satellite time series data. TESSERA combines information from only two data streams, optical and SAR, using two parallel Transformer-based encoders: one dedicated to Sentinel-1 SAR polarizations and another to Sentinel-2 MSI data (10 selected spectral bands). The two resulting representations are fused by a multilayer perceptron (MLP), yielding a global representation map covering the years 2017 to 2024. Our precomputed representations set a new state-of-the-art performance benchmark, and our open-source approach democratizes access to high-performance, high-resolution representations. We benchmark TESSERA on five diverse tasks, comparing it with state-of-the-art task-specific models and other foundation models. Our results show that TESSERA outperforms both traditional RS baselines and the leading geospatial foundation models on these diverse downstream tasks.
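To make the dual-encoder design concrete, the following is a minimal, untrained sketch of the data flow the abstract describes: one encoder per sensor over a single pixel's time series, followed by MLP fusion. It is an illustration under stated assumptions, not TESSERA's implementation. All class and function names are hypothetical, each encoder is reduced to a single self-attention layer with mean pooling over time, the two-channel Sentinel-1 input (for example VV and VH), the hidden width of 8, and the 16-dimensional output are assumed values, and temporal encoding and the self-supervised training objective are omitted entirely.

```python
import math
import random


def init_linear(rng, n_in, n_out):
    """Randomly initialise a weight matrix (list of rows) and a zero bias."""
    scale = 1.0 / math.sqrt(n_in)
    w = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out


def linear(x, w, b):
    """Compute y = W x + b for a single input vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class StreamEncoder:
    """Hypothetical single-sensor encoder: one self-attention layer, then mean pooling."""

    def __init__(self, rng, n_bands, d_model):
        self.d_model = d_model
        self.embed = init_linear(rng, n_bands, d_model)
        self.q = init_linear(rng, d_model, d_model)
        self.k = init_linear(rng, d_model, d_model)
        self.v = init_linear(rng, d_model, d_model)

    def __call__(self, series):
        # series: list of observations, each a list of n_bands floats.
        tokens = [linear(obs, *self.embed) for obs in series]
        qs = [linear(t, *self.q) for t in tokens]
        ks = [linear(t, *self.k) for t in tokens]
        vs = [linear(t, *self.v) for t in tokens]
        attended = []
        for q, tok in zip(qs, tokens):
            # Scaled dot-product attention of this time step over all time steps.
            scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(self.d_model) for k in ks]
            weights = softmax(scores)
            ctx = [sum(w * v[j] for w, v in zip(weights, vs)) for j in range(self.d_model)]
            attended.append([t + c for t, c in zip(tok, ctx)])  # residual connection
        # Mean-pool over time to get one vector per pixel.
        return [sum(row[j] for row in attended) / len(attended) for j in range(self.d_model)]


class DualStreamModel:
    """Hypothetical fusion model: two parallel encoders, concatenation, two-layer MLP."""

    def __init__(self, seed=0, s2_bands=10, s1_bands=2, d_model=8, d_out=16):
        rng = random.Random(seed)
        self.s2_encoder = StreamEncoder(rng, s2_bands, d_model)  # optical stream
        self.s1_encoder = StreamEncoder(rng, s1_bands, d_model)  # SAR stream (assumed 2 channels)
        self.hidden = init_linear(rng, 2 * d_model, 2 * d_model)
        self.out = init_linear(rng, 2 * d_model, d_out)

    def __call__(self, s2_series, s1_series):
        joint = self.s2_encoder(s2_series) + self.s1_encoder(s1_series)  # list concatenation
        h = [max(0.0, z) for z in linear(joint, *self.hidden)]  # ReLU
        return linear(h, *self.out)


# Toy input for one pixel: 6 optical dates and 4 SAR dates (random placeholder values).
data_rng = random.Random(1)
s2 = [[data_rng.random() for _ in range(10)] for _ in range(6)]
s1 = [[data_rng.random() for _ in range(2)] for _ in range(4)]

model = DualStreamModel()
embedding = model(s2, s1)
print(len(embedding))
```

Because each stream is pooled separately before fusion, the two sensors can have different numbers of acquisition dates for the same pixel, which is one plausible reason for keeping the encoders parallel rather than concatenating the raw inputs.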