Knowledge-Guided Masked Autoencoder with Linear Spectral Mixing and Spectral-Angle-Aware Reconstruction

📅 2025-12-13
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Hyperspectral image self-supervised reconstruction suffers from poor representation interpretability, weak generalization, and low data efficiency. Method: This paper embeds physics-informed priors—specifically the Linear Spectral Mixture Model (LSMM) and Spectral Angle Mapper (SAM)—into a Vision Transformer-based Masked Autoencoder (ViT-MAE) framework. It is the first to jointly formulate LSMM constraints and SAM-based geometric metrics as reconstruction objectives within MAE, enabling end-to-end co-optimization of data-driven learning and physical modeling. Huber loss is employed to jointly optimize reconstruction fidelity and spectral geometric consistency. Contribution/Results: Under limited labeling, the method improves downstream classification and unmixing performance by over 8%, enhances training stability, and yields latent representations that strictly adhere to linear mixing physics. This significantly boosts few-shot robustness and physical interpretability.

Technology Category

Application Category

📝 Abstract
Integrating domain knowledge into deep learning has emerged as a promising direction for improving model interpretability, generalization, and data efficiency. In this work, we present a novel knowledge-guided ViT-based Masked Autoencoder that embeds scientific domain knowledge within the self-supervised reconstruction process. Instead of relying solely on data-driven optimization, our proposed approach incorporates the Linear Spectral Mixing Model (LSMM) as a physical constraint and physically-based Spectral Angle Mapper (SAM), ensuring that learned representations adhere to known structural relationships between observed signals and their latent components. The framework jointly optimizes LSMM and SAM loss with a conventional Huber loss objective, promoting both numerical accuracy and geometric consistency in the feature space. This knowledge-guided design enhances reconstruction fidelity, stabilizes training under limited supervision, and yields interpretable latent representations grounded in physical principles. The experimental findings indicate that the proposed model substantially enhances reconstruction quality and improves downstream task performance, highlighting the promise of embedding physics-informed inductive biases within transformer-based self-supervised learning.
Problem

Research questions and friction points this paper is trying to address.

Integrates domain knowledge into self-supervised learning for improved interpretability
Incorporates physical constraints to enhance reconstruction fidelity and training stability
Improves downstream task performance using physics-informed inductive biases in transformers
Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge-guided ViT-based Masked Autoencoder with self-supervised reconstruction
Incorporates Linear Spectral Mixing Model as physical constraint
Uses Spectral Angle Mapper for geometric consistency in feature space
🔎 Similar Papers
No similar papers found.