AI Summary
This work addresses hyperspectral image semantic segmentation in proximal sensing scenarios, where scarce annotated data limits conventional in-domain training. It presents the first systematic investigation of how well hyperspectral foundation models pretrained on remote sensing data transfer to proximal sensing, proposing a direct cross-domain transfer strategy that needs no cross-modal alignment between RGB and hyperspectral inputs. The approach preserves spectral information and keeps the model architecture simple. Experiments on the HS3-Bench benchmark show that cross-domain transfer yields large improvements over standard in-domain training, narrows the performance gap to more complex cross-modal techniques, and remains strong when annotations are scarce.
Abstract
Hyperspectral imaging (HSI) semantic segmentation typically relies on in-domain training, but limited data availability often restricts model performance in real-world applications. Current approaches to leveraging foundation models in proximal sensing use cross-modality techniques, bridging RGB and HSI to exploit vision foundation models. However, these methods either discard spectral information or introduce architectural complexity. We propose cross-domain transfer as an alternative, reusing HSI foundation models, originally trained in remote sensing, for proximal sensing applications. By eliminating the need to bridge modality gaps, our approach preserves spectral information while maintaining a simple architecture. Using the HS3-Bench benchmark, we systematically evaluate and compare conventional in-domain, in-modality training, cross-modality transfer, and cross-domain transfer strategies. Our results demonstrate that cross-domain transfer achieves large performance improvements over in-domain, in-modality training, reduces the performance gap to cross-modality approaches, and maintains strong performance in limited data settings. Thus, this work advances more effective HSI semantic segmentation in diverse applications.