TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis

📅 2025-06-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the need for globally scalable, robust, 10-meter-resolution spatiotemporal embeddings of Earth’s surface that integrate optical and radar observations to support climate modeling, carbon accounting, and sustainable land-use monitoring. We propose a dual-stream Transformer architecture, presented as the first of its kind, that jointly models Sentinel-2 multispectral and Sentinel-1 SAR time series end to end. Trained with self-supervised learning, twin encoders extract optical and SAR features independently, and an MLP fuses them into unified spatiotemporal embeddings. A precomputed global atlas spanning 2017–2024 is publicly released. Evaluated on five diverse downstream tasks, including land-cover classification, crop-type mapping, and deforestation detection, the model consistently outperforms both task-specific models and state-of-the-art geospatial foundation models, establishing a new benchmark for remote sensing representation learning.

📝 Abstract

Satellite remote sensing (RS) enables a wide array of downstream Earth observation (EO) applications, including climate modeling, carbon accounting, and strategies for conservation and sustainable land use. We present TESSERA, a novel Remote Sensing Foundation Model (RSFM) that uses Self-Supervised Learning (SSL) to generate global, robust representations at 10 m scale from pixel-level satellite time series data. TESSERA combines information from only two data streams, optical and SAR, using two parallel Transformer-based encoders: one dedicated to Sentinel-1 SAR polarizations and another to Sentinel-2 MSI data (10 selected spectral bands). The resulting representations are fused using a multilayer perceptron (MLP), yielding a global representation map covering the years 2017 to 2024. Our precomputed representations set a new state-of-the-art performance benchmark, and our open-source approach democratizes access to high-performance, high-resolution representations. We benchmark TESSERA on five diverse tasks, comparing it with state-of-the-art task-specific models and other foundation models. Our results show that TESSERA outperforms both traditional RS baselines and the leading geospatial foundation models on these downstream tasks.

Problem

Research questions and friction points this paper is trying to address.

Generating global, robust 10 m-scale representations from pixel-level satellite time series

Combining optical and SAR data streams using Transformer-based encoders

Outperforming traditional RS baselines and geospatial foundation models across diverse downstream tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-Supervised Learning for global 10 m representations

Dual Transformer encoders, one for optical (Sentinel-2) and one for SAR (Sentinel-1) time series

Multilayer perceptron fuses the Sentinel-1 and Sentinel-2 encoder representations
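
The dual-encoder-plus-MLP layout listed above can be illustrated with a minimal NumPy sketch. This is not the TESSERA implementation: the single-head attention block, mean pooling over time, the layer sizes, the use of two Sentinel-1 channels (assumed here to be VV and VH), and all names (`TinyEncoder`, `fuse`, `D_MODEL`, `D_OUT`) are assumptions made for illustration, and the weights are random rather than learned with self-supervision. Only the overall shape (one encoder per sensor, 10 Sentinel-2 bands, MLP fusion into one per-pixel vector) comes from the description.

```python
import numpy as np

rng = np.random.default_rng(0)


def init_linear(d_in, d_out):
    """Return (weights, bias) for a dense layer with small random weights."""
    return rng.normal(0.0, d_in ** -0.5, size=(d_in, d_out)), np.zeros(d_out)


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


class TinyEncoder:
    """One single-head self-attention block over a single pixel's time series."""

    def __init__(self, n_channels, d_model):
        self.d_model = d_model
        self.embed = init_linear(n_channels, d_model)
        self.q = init_linear(d_model, d_model)
        self.k = init_linear(d_model, d_model)
        self.v = init_linear(d_model, d_model)

    def __call__(self, series):
        # series has shape (time_steps, n_channels)
        h = series @ self.embed[0] + self.embed[1]
        q = h @ self.q[0] + self.q[1]
        k = h @ self.k[0] + self.k[1]
        v = h @ self.v[0] + self.v[1]
        attn = softmax(q @ k.T / np.sqrt(self.d_model))
        h = h + attn @ v          # residual connection
        return h.mean(axis=0)     # pool over time -> (d_model,)


def fuse(optical_vec, sar_vec, layer1, layer2):
    """Two-layer MLP over the concatenated encoder outputs."""
    x = np.concatenate([optical_vec, sar_vec])
    hidden = np.maximum(0.0, x @ layer1[0] + layer1[1])  # ReLU
    return hidden @ layer2[0] + layer2[1]


# Hypothetical sizes, chosen only to keep the example small.
D_MODEL, D_OUT = 32, 16
s2_encoder = TinyEncoder(n_channels=10, d_model=D_MODEL)  # 10 Sentinel-2 bands
s1_encoder = TinyEncoder(n_channels=2, d_model=D_MODEL)   # assumed VV and VH
mlp1 = init_linear(2 * D_MODEL, 64)
mlp2 = init_linear(64, D_OUT)

# Synthetic stand-ins for one pixel's observations; the two sensors
# need not share the same number of acquisition dates.
s2_series = rng.random((24, 10))
s1_series = rng.random((30, 2))

embedding = fuse(s2_encoder(s2_series), s1_encoder(s1_series), mlp1, mlp2)
print(embedding.shape)  # (16,)
```

Because each encoder pools its own sequence before fusion, the two streams can have different lengths and acquisition dates. The sketch omits any temporal or positional encoding, so it ignores the order of observations; a real time-series model would need one.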
