SHRUG-FM: Reliability-Aware Foundation Models for Earth Observation

📅 2025-11-13
🤖 AI Summary
Geospatial foundation models often become less reliable in regions underrepresented in pretraining data, such as low-elevation zones and large river areas. SHRUG-FM is a reliability-aware prediction framework that integrates three complementary signals: (1) out-of-distribution (OOD) detection in the input space, (2) OOD detection in the embedding space, and (3) task-specific predictive uncertainty. Linking the resulting flags to land cover attributes from HydroATLAS shows that failures concentrate in specific geographies rather than occurring at random. On burn scar segmentation, OOD scores correlate with lower performance under specific environmental conditions, and uncertainty-based flags help discard many poorly performing predictions, supporting safer and more interpretable deployment in climate-sensitive applications.

📝 Abstract
Geospatial foundation models for Earth observation often fail to perform reliably in environments underrepresented during pretraining. We introduce SHRUG-FM, a framework for reliability-aware prediction that integrates three complementary signals: out-of-distribution (OOD) detection in the input space, OOD detection in the embedding space and task-specific predictive uncertainty. Applied to burn scar segmentation, SHRUG-FM shows that OOD scores correlate with lower performance in specific environmental conditions, while uncertainty-based flags help discard many poorly performing predictions. Linking these flags to land cover attributes from HydroATLAS shows that failures are not random but concentrated in certain geographies, such as low-elevation zones and large river areas, likely due to underrepresentation in pretraining data. SHRUG-FM provides a pathway toward safer and more interpretable deployment of GFMs in climate-sensitive applications, helping bridge the gap between benchmark performance and real-world reliability.
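
The abstract names three signals but does not specify how each is computed or combined. As a minimal sketch only, the Python below assumes a k-nearest-neighbour distance as the OOD score (usable on input statistics or embeddings), mean per-pixel binary entropy as the predictive uncertainty, and a simple "any signal above its threshold" rule for flagging. All function names, the choice of scores, and the thresholds are illustrative assumptions, not the method used in SHRUG-FM.

```python
import numpy as np

def knn_ood_score(reference, query, k=5):
    """Mean Euclidean distance from each query vector to its k nearest
    reference vectors. Assumed OOD score; larger means further from the
    reference (e.g. pretraining) distribution."""
    # Pairwise distances, shape (n_query, n_reference)
    dists = np.linalg.norm(query[:, None, :] - reference[None, :, :], axis=-1)
    nearest = np.sort(dists, axis=1)[:, :k]
    return nearest.mean(axis=1)

def mean_binary_entropy(probs, eps=1e-12):
    """Average per-pixel entropy (in nats) of binary probability maps.
    Expects shape (n_tiles, height, width); returns one value per tile."""
    p = np.clip(probs, eps, 1.0 - eps)
    entropy = -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))
    return entropy.reshape(entropy.shape[0], -1).mean(axis=1)

def reliability_flags(input_ood, embed_ood, uncertainty, thresholds):
    """Flag a tile when any of the three signals exceeds its threshold.
    The OR rule and the threshold values are assumptions."""
    t_in, t_emb, t_unc = thresholds
    return (input_ood > t_in) | (embed_ood > t_emb) | (uncertainty > t_unc)
```

In practice the thresholds would need to be calibrated on held-out data, for example as a percentile of scores observed on in-distribution tiles.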
Problem

Research questions and friction points this paper is trying to address.

Addresses unreliable geospatial foundation models in underrepresented environments
Combines OOD detection and predictive uncertainty to assess the reliability of burn scar segmentation
Identifies systematic failures linked to specific underrepresented geographic features
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates OOD detection in input and embedding spaces
Adds task-specific predictive uncertainty as a complementary reliability signal
Links failure flags to land cover attributes from HydroATLAS
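
One way to check whether flags cluster along a geographic covariate is to compare the flagged fraction across bins of that covariate. The sketch below assumes one attribute value and one boolean flag per tile; the function name, the elevation example, and the bin edges are hypothetical and do not reflect HydroATLAS field names or the paper's actual analysis.

```python
import numpy as np

def flag_rate_by_bin(attribute, flags, bin_edges):
    """Return {bin_index: fraction of flagged tiles} for each non-empty bin.

    Bin 0 holds values below bin_edges[0]; bin i holds values in
    [bin_edges[i-1], bin_edges[i]); the last bin holds values at or
    above bin_edges[-1].
    """
    attribute = np.asarray(attribute, dtype=float)
    flags = np.asarray(flags, dtype=bool)
    bin_index = np.digitize(attribute, bin_edges)
    rates = {}
    for i in range(len(bin_edges) + 1):
        in_bin = bin_index == i
        if in_bin.any():
            rates[i] = float(flags[in_bin].mean())
    return rates

# Hypothetical per-tile elevations (metres) and reliability flags.
elevation_m = [10, 50, 300, 800, 1500]
flagged = [True, True, False, False, True]
rates = flag_rate_by_bin(elevation_m, flagged, bin_edges=[100, 1000])
```

A markedly higher rate in one bin than in the others would suggest that failures are tied to that covariate rather than spread at random.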