SpectralEarth: Training Hyperspectral Foundation Models at Scale

📅 2024-08-15
🏛️ IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
📈 Citations: 18
Influential: 1
🤖 AI Summary
To address a critical bottleneck for foundation-model development in hyperspectral imagery—the lack of globally representative, multi-temporal, large-scale benchmark datasets—this work introduces SpectralEarth, the first large-scale multi-temporal hyperspectral pretraining dataset (415K locations, 538K image patches). The authors present a scalable self-supervised pretraining framework for hyperspectral foundation models, leveraging masked autoencoders (MAE) and SimCLR, and propose a spectral adapter architecture that embeds spectral-specific inductive biases into standard vision backbones. They also establish a unified benchmark for hyperspectral downstream tasks comprising nine diverse datasets. The method enables joint spectral-spatial modeling and cross-sensor generalization via fine-tuning. Experiments show the pretrained models significantly outperform supervised baselines on land-cover, crop, and tree-species classification; achieve roughly 40% higher fine-tuning efficiency; and exhibit strong robustness to cross-sensor domain shifts.
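The summary's central architectural idea—a spectral adapter that compresses the hyperspectral band dimension before a conventional vision backbone sees the data—can be sketched as a strided 1-D convolution along the spectral axis. The kernel width, stride, channel count, and band count (EnMAP has roughly 200 usable bands) below are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def spectral_adapter(cube, weights, bias, stride=2):
    """Strided 1-D convolution along the spectral axis of a hyperspectral patch.

    cube:    (H, W, B)   hyperspectral image patch with B bands
    weights: (C_out, k)  1-D kernels shared across all pixels
    bias:    (C_out,)
    returns: (H, W, C_out, B_out) spectrally compressed features that a
             spatial backbone (e.g. a ViT) could consume downstream
    """
    H, W, B = cube.shape
    C_out, k = weights.shape
    n_out = (B - k) // stride + 1  # number of spectral positions after striding
    out = np.empty((H, W, C_out, n_out))
    for i in range(n_out):
        # Slide a width-k window over the band axis; same kernels at every pixel.
        window = cube[:, :, i * stride : i * stride + k]   # (H, W, k)
        out[:, :, :, i] = window @ weights.T + bias        # (H, W, C_out)
    return out

# Toy example: an 8x8 patch with 202 bands, compressed by 16 kernels of width 7.
cube = np.random.rand(8, 8, 202)
w = np.random.rand(16, 7) * 0.1
b = np.zeros(16)
feat = spectral_adapter(cube, w, b)
print(feat.shape)  # (8, 8, 16, 98)
```

Because the kernels operate only along the band axis, the adapter injects a spectral inductive bias (local correlations between neighboring wavelengths) while leaving spatial modeling to the unchanged vision backbone.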

📝 Abstract
Foundation models have triggered a paradigm shift in computer vision and are increasingly being adopted in remote sensing, particularly for multispectral imagery. Yet, their potential in hyperspectral imaging (HSI) remains untapped due to the absence of comprehensive and globally representative hyperspectral datasets. To close this gap, we introduce SpectralEarth, a large-scale multitemporal dataset designed to pretrain hyperspectral foundation models leveraging data from the Environmental Mapping and Analysis Program (EnMAP). SpectralEarth comprises 538 974 image patches covering 415 153 unique locations from 11 636 globally distributed EnMAP scenes spanning two years of archive. In addition, 17.5% of these locations include multiple timestamps, enabling multitemporal HSI analysis. Utilizing state-of-the-art self-supervised learning algorithms, we pretrain a series of foundation models on SpectralEarth, integrating a spectral adapter into classical vision backbones to accommodate the unique characteristics of HSI. In tandem, we construct nine downstream datasets for land-cover, crop-type mapping, and tree-species classification, providing benchmarks for model evaluation. Experimental results support the versatility of our models and their generalizability across different tasks and sensors. We also highlight computational efficiency during model fine-tuning.
Problem

Research questions and friction points this paper is trying to address.

Lack of comprehensive hyperspectral datasets for foundation models
Need for multitemporal hyperspectral imaging analysis capabilities
Challenges in adapting vision backbones for hyperspectral data characteristics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale hyperspectral dataset SpectralEarth introduced
Self-supervised learning for hyperspectral foundation models
Spectral adapter integrated into vision backbones
Nassim Ait Ali Braham
Data Science in Earth Observation, Technical University of Munich, Germany
C. Albrecht
Remote Sensing Technology Institute, German Aerospace Center, Germany
Julien Mairal
Inria - Univ. Grenoble Alpes
machine learning, artificial intelligence, optimization, computer vision, image processing
J. Chanussot
Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LJK, 38000 Grenoble, France
Yi Wang
Data Science in Earth Observation, Technical University of Munich, Germany
Xiao Xiang Zhu
Technical University of Munich
Earth Observation, AI4EO, Signal Processing, Data Science, Synthetic Aperture Radar