🤖 AI Summary
Conventional models for predicting climate-driven land-surface dynamics suffer from poor spatial generalization and degraded performance in data-scarce regions, while existing vision foundation models incur prohibitive computational costs and lack explicit mechanisms for modeling spatiotemporal geophysical processes. Method: We propose StefaLand, a geoscience-oriented foundation model for land-surface modeling, integrating a masked autoencoder backbone, attribute-based representation learning, a location-aware architecture, and residual fine-tuning adapters to explicitly couple static geographic priors with multi-source temporal observations. Contribution/Results: Evaluated on three tasks across four benchmark datasets (streamflow, soil moisture, and soil composition), StefaLand achieves significant improvements over state-of-the-art methods, especially in data-scarce regions. Crucially, it can be pretrained and adapted to downstream tasks using only academic-scale compute, overcoming a key applicability bottleneck of generic vision models in land-surface dynamics modeling.
📝 Abstract
Stewarding natural resources, mitigating floods, droughts, wildfires, and landslides, and meeting growing demands all require models that predict climate-driven land-surface responses and human feedback with high accuracy. Traditional impact models, whether process-based, statistical, or machine learning, struggle with spatial generalization due to limited observations and concept drift. Recently proposed vision foundation models trained on satellite imagery demand massive compute and are ill-suited to dynamic land-surface prediction. We introduce StefaLand, a generative spatiotemporal earth foundation model centered on landscape interactions. StefaLand improves predictions over prior state-of-the-art on three tasks across four datasets: streamflow, soil moisture, and soil composition. Results highlight its ability to generalize across diverse, data-scarce regions and to support broad land-surface applications. The model builds on a masked autoencoder backbone that learns deep joint representations of landscape attributes, with a location-aware architecture that fuses static and time-series inputs, attribute-based representations that drastically reduce compute, and residual fine-tuning adapters that enhance transfer. While its components are inspired by prior methods, their alignment with geoscience and their integration in one model enable robust performance on dynamic land-surface tasks. StefaLand can be pretrained and fine-tuned on academic compute yet outperforms state-of-the-art baselines and even fine-tuned vision foundation models. To our knowledge, this is the first geoscience land-surface foundation model that demonstrably improves dynamic land-surface interaction predictions and supports diverse downstream applications.
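To make the two architectural ideas above concrete, here is a minimal NumPy sketch of (a) fusing static landscape attributes with a time-series input at one location, and (b) a residual fine-tuning adapter added on top of a frozen pretrained path. All dimensions, weight names, and the low-rank adapter form are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): d_attr static landscape
# attributes, d_ts time-series features per step, d_h hidden width.
d_attr, d_ts, d_h = 8, 4, 16

# Frozen stand-ins for pretrained encoder weights.
W_attr = rng.standard_normal((d_attr, d_h)) * 0.1  # static-attribute path
W_ts = rng.standard_normal((d_ts, d_h)) * 0.1      # time-series path

# Residual adapter: a small trainable low-rank correction added to the
# frozen path, so fine-tuning updates only A and B.
r = 2
A = rng.standard_normal((d_h, r)) * 0.01
B = np.zeros((r, d_h))  # zero-init: the adapter starts as a no-op residual

def encode(attrs, ts_step):
    """Location-aware fusion of static attributes with one time step,
    plus the residual adapter correction."""
    h = attrs @ W_attr + ts_step @ W_ts  # frozen pretrained fusion
    return h + h @ A @ B                 # residual adapter term

attrs = rng.standard_normal(d_attr)
ts_step = rng.standard_normal(d_ts)
h = encode(attrs, ts_step)
print(h.shape)  # (16,)
```

Because `B` is zero-initialized, the adapter initially leaves the pretrained representation unchanged; fine-tuning then learns a small task-specific residual without touching the frozen weights, which is what makes downstream adaptation cheap.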