🤖 AI Summary
Conventional hyperspectral remote sensing models require per-image fine-tuning due to variable spectral band counts, incurring substantial computational and annotation overhead. Method: We propose the first tuning-free hyperspectral foundation model, featuring a channel-adaptive weight dictionary and a semantic-distance-driven multi-mask prompting mechanism, integrated with full-spectrum dynamic embedding (0.4–2.5 μm) and learnable visual prompt engineering for single-prompt zero-shot transfer. Contribution/Results: The model operates without any labeled data or image-level fine-tuning, enabling cross-band and cross-task generalization. Evaluated on five downstream tasks across eleven hyperspectral datasets, it achieves performance comparable to 5-shot fine-tuned task-specific models using only one generic prompt, significantly reducing hardware and time costs. To our knowledge, this is the first practically viable zero-shot foundation model paradigm for hyperspectral remote sensing.
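The channel-adaptive idea can be illustrated with a minimal sketch: a learned dictionary holds one embedding weight vector per wavelength on a fine grid spanning the full spectrum, and for an input image with any number of bands, the embedding layer is assembled on the fly by looking up each band's center wavelength. The function and variable names below are illustrative assumptions, not the paper's released code.

```python
import numpy as np

def build_embedding_weights(dictionary, grid_um, band_centers_um):
    """Assemble per-band embedding weights for an image with arbitrary bands.

    dictionary: (G, D) learned weights, one row per grid wavelength.
    grid_um: (G,) wavelength grid in micrometers (e.g. spanning 0.4-2.5 um).
    band_centers_um: (C,) center wavelengths of the input image's bands.
    Returns a (C, D) matrix: one embedding vector per input channel,
    taken from the nearest grid wavelength.
    """
    centers = np.asarray(band_centers_um)
    idx = np.abs(grid_um[None, :] - centers[:, None]).argmin(axis=1)
    return dictionary[idx]

# Toy usage: a 211-entry grid over 0.4-2.5 um (0.01 um steps), embedding dim 8.
grid = np.linspace(0.4, 2.5, 211)
dic = np.random.default_rng(0).normal(size=(211, 8))
# A hypothetical 4-band sensor: weights are gathered per wavelength,
# so no per-image fine-tuning of the embedding layer is needed.
w = build_embedding_weights(dic, grid, [0.45, 0.86, 1.60, 2.20])
print(w.shape)  # (4, 8)
```

Because the dictionary is indexed by physical wavelength rather than channel position, the same pre-trained weights serve sensors with 4, 100, or 200+ bands without retraining.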
📝 Abstract
Advanced interpretation of hyperspectral remote sensing images benefits many precise Earth observation tasks. Recently, visual foundation models have advanced remote sensing interpretation, but they concentrate on RGB and multispectral images. Because hyperspectral channel counts vary from image to image, existing foundation models would require image-by-image tuning, imposing great pressure on hardware and time resources. In this paper, we propose a tuning-free hyperspectral foundation model called HyperFree, built by adapting existing visual prompt engineering. To process varied channel numbers, we design a learned weight dictionary covering the full spectrum from $0.4 \sim 2.5\,\mu\text{m}$, enabling the embedding layer to be built dynamically. To make prompt design more tractable, HyperFree can generate multiple semantic-aware masks for one prompt by treating feature distance as semantic similarity. After pre-training HyperFree on constructed large-scale, high-resolution hyperspectral images, HyperFree (1 prompt) has shown comparable results with specialized models (5 shots) on 5 tasks and 11 datasets. Code and dataset are accessible at https://rsidea.whu.edu.cn/hyperfree.htm.
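The multi-mask prompting mechanism can be sketched as follows: the feature of the prompted pixel is compared against every pixel's feature, and thresholding the similarity map at several levels yields multiple semantic-aware masks from a single prompt. This is a simplified assumption of the mechanism using cosine similarity; the thresholds and function names are hypothetical.

```python
import numpy as np

def masks_from_prompt(features, prompt_yx, thresholds=(0.9, 0.7, 0.5)):
    """Generate one binary mask per threshold from a single point prompt.

    features: (H, W, D) per-pixel feature map.
    prompt_yx: (row, col) of the prompted pixel.
    Pixels whose cosine similarity to the prompt feature exceeds a
    threshold are treated as semantically similar and included in that mask.
    """
    f = features / np.linalg.norm(features, axis=-1, keepdims=True)
    q = f[prompt_yx]            # (D,) normalized prompt feature
    sim = f @ q                 # (H, W) cosine-similarity map
    return [sim >= t for t in thresholds]

# Toy usage: a 16x16 map of 32-dim features, prompt at pixel (3, 3).
rng = np.random.default_rng(1)
feats = rng.normal(size=(16, 16, 32))
masks = masks_from_prompt(feats, (3, 3))
```

Lower thresholds produce progressively larger, more permissive masks, which is what lets one generic prompt cover objects at multiple semantic granularities.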