AI Summary
This study addresses the limited native support for hyperspectral imaging (HSI) in existing geospatial foundation models, which struggle with its high-dimensional spectral data. It evaluates how well the general-purpose multimodal geospatial foundation model TerraMind adapts to hyperspectral downstream tasks without any hyperspectral pretraining. To bridge this gap, the work implements and compares two channel adaptation strategies: naive band selection and physics-aware channel grouping based on spectral response functions (SRFs). The experiments show that band selection lets TerraMind adapt to hyperspectral tasks with only a moderate performance decline. They also indicate that models with native hyperspectral support generally perform better, which establishes a baseline for HSI integration and motivates native spectral tokenization in future multimodal geospatial foundation models.
Abstract
Geospatial Foundation Models (GFMs) typically lack native support for Hyperspectral Imaging (HSI) due to the complexity and sheer size of high-dimensional spectral data. This study investigates the adaptability of TerraMind, a multimodal GFM, to HSI downstream tasks without HSI-specific pretraining. To this end, we implement and compare two channel adaptation strategies: Naive Band Selection and physics-aware Spectral Response Function (SRF) grouping. Overall, our results indicate a general superiority of deep learning models with native support for HSI data. Our experiments also demonstrate that TerraMind can adapt to HSI downstream tasks through band selection with a moderate performance decline. These findings establish a critical baseline for HSI integration and motivate native spectral tokenization in future multimodal model architectures.
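The two channel adaptation strategies named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the hyperspectral cube is stored as `(bands, height, width)`, that the center wavelength of each HSI band is known, and that each target channel of the pretrained model can be described by a center wavelength and a full width at half maximum (FWHM). The Gaussian SRF shape, the function names, and all numeric values are hypothetical; in practice, tabulated SRF curves for the target sensor would replace the Gaussian approximation.

```python
import numpy as np


def naive_band_selection(cube, hsi_wavelengths, target_centers):
    """For each target channel, keep the single HSI band nearest in wavelength.

    cube: array of shape (bands, height, width)
    Returns the reduced cube (targets, height, width) and the chosen band indices.
    """
    hsi_wavelengths = np.asarray(hsi_wavelengths, dtype=float)
    target_centers = np.asarray(target_centers, dtype=float)
    # Distance matrix of shape (targets, bands); argmin picks the closest band.
    dist = np.abs(hsi_wavelengths[None, :] - target_centers[:, None])
    idx = dist.argmin(axis=1)
    return cube[idx], idx


def gaussian_srf_weights(hsi_wavelengths, target_centers, target_fwhm):
    """Build per-channel weights from an ASSUMED Gaussian SRF.

    Returns an array of shape (targets, bands) whose rows sum to 1.
    """
    wl = np.asarray(hsi_wavelengths, dtype=float)
    centers = np.asarray(target_centers, dtype=float)
    fwhm = np.asarray(target_fwhm, dtype=float)
    # Convert FWHM to the standard deviation of a Gaussian.
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    z = (wl[None, :] - centers[:, None]) / sigma[:, None]
    weights = np.exp(-0.5 * z**2)
    # Normalise so each target channel is a weighted mean of HSI bands.
    return weights / weights.sum(axis=1, keepdims=True)


def srf_grouping(cube, weights):
    """Aggregate HSI bands into target channels using SRF weights.

    cube: (bands, height, width); weights: (targets, bands)
    Returns an array of shape (targets, height, width).
    """
    return np.tensordot(weights, cube, axes=([1], [0]))


# Illustrative usage with made-up wavelengths in nanometres (hypothetical values).
hsi_wl = np.linspace(400.0, 1000.0, 61)          # 61 bands, 10 nm spacing
demo_cube = np.random.default_rng(0).random((61, 4, 4))
centers = [490.0, 560.0, 665.0, 842.0]           # assumed target channel centers
fwhm = [65.0, 35.0, 30.0, 115.0]                 # assumed target channel widths

selected, chosen = naive_band_selection(demo_cube, hsi_wl, centers)
grouped = srf_grouping(demo_cube, gaussian_srf_weights(hsi_wl, centers, fwhm))
```

Both functions produce the same output shape, so either result can be fed to a model that expects a fixed number of multispectral channels. Band selection discards most of the spectrum, whereas SRF grouping averages neighboring bands under the assumed response curve. A target channel lying far outside the HSI wavelength range would yield near-zero weights before normalisation, so such channels should be handled separately.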