The Transparent Earth: A Multimodal Foundation Model for the Earth's Subsurface

📅 2025-09-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses sparse, multi-source, heterogeneous subsurface geoscience observations (e.g., stress orientation angles, mantle temperatures, tectonic plate types) by proposing the first scalable multimodal foundation model for unified cross-modal modeling and zero-shot subsurface property prediction. Methodologically, it introduces a text-embedding-driven modality encoding scheme, combined with positional encoding, that enables dynamic alignment and joint modeling of eight heterogeneous input types (directional angles, categorical labels, and continuous physical quantities) within a Transformer architecture that uses multi-head attention for modality-agnostic feature fusion. Contributions include: (1) the first multimodal subsurface foundation model supporting arbitrary modality combinations and in-context learning; (2) plug-and-play extensibility to new observation types; and (3) a more-than-threefold reduction in stress-orientation prediction error, with performance gains that scale consistently with model size, demonstrating strong generalization and scalability.

📝 Abstract
We present the Transparent Earth, a transformer-based architecture for reconstructing subsurface properties from heterogeneous datasets that vary in sparsity, resolution, and modality, where each modality represents a distinct type of observation (e.g., stress angle, mantle temperature, tectonic plate type). The model incorporates positional encodings of observations together with modality encodings, derived from a text embedding model applied to a description of each modality. This design enables the model to scale to an arbitrary number of modalities, making it straightforward to add new ones not considered in the initial design. We currently include eight modalities spanning directional angles, categorical classes, and continuous properties such as temperature and thickness. These capabilities support in-context learning, enabling the model to generate predictions either with no inputs or with an arbitrary number of additional observations from any subset of modalities. On validation data, this reduces errors in predicting stress angle by more than a factor of three. The proposed architecture is scalable and demonstrates improved performance with increased parameters. Together, these advances make the Transparent Earth an initial foundation model for the Earth's subsurface that ultimately aims to predict any subsurface property anywhere on Earth.
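The abstract's key design choice, pairing a positional encoding of each observation's location with a modality encoding derived from a text embedding of that modality's description, can be sketched as follows. This is a minimal illustration only: the sinusoidal scheme, the dimensions, and the function names below are assumptions, not the paper's actual implementation.

```python
import math
import numpy as np

def positional_encoding(lat, lon, dim=32):
    """Sinusoidal encoding of a location (illustrative stand-in for the
    paper's positional encoding; the exact scheme is not specified here)."""
    coords = np.array([lat / 90.0, lon / 180.0])   # normalize to [-1, 1]
    freqs = 2.0 ** np.arange(dim // 4)             # geometric frequency ladder
    angles = np.outer(coords, freqs) * math.pi     # shape (2, dim // 4)
    # Interleave sines and cosines and flatten to a single (dim,) vector.
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=None)

def observation_token(lat, lon, value_embedding, modality_text_embedding):
    """Concatenate positional, value, and modality encodings into one
    input token for the Transformer (hypothetical layout)."""
    return np.concatenate([positional_encoding(lat, lon),
                           value_embedding,
                           modality_text_embedding])

# Example: a stress-angle observation at 35N, 106W (stand-in vectors).
value_emb = np.random.randn(16)     # encoded observed value (placeholder)
modality_emb = np.random.randn(64)  # text-embedded modality description (placeholder)
tok = observation_token(35.0, -106.0, value_emb, modality_emb)
assert tok.shape == (32 + 16 + 64,)
```

Because the modality encoding comes from a text description rather than a fixed learned table, adding a new observation type only requires writing a description and embedding it, which is what makes the design extensible to modalities not considered in the initial design.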
Problem

Research questions and friction points this paper is trying to address.

Reconstructing subsurface properties from heterogeneous datasets
Handling varying sparsity, resolution, and modality types
Predicting any subsurface property anywhere on Earth
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer-based architecture for subsurface reconstruction
Modality encodings from text embeddings for scalability
In-context learning with arbitrary observation inputs
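The in-context learning point above, conditioning a prediction on an arbitrary number of observations from any subset of modalities, can be illustrated with a single scaled dot-product attention step over a variable-length context. This is a sketch standing in for the paper's multi-head Transformer; the token width and all variable names are assumptions.

```python
import numpy as np

def attention(q, K, V):
    """Single-head scaled dot-product attention: the query token attends
    over however many context observations happen to be available."""
    d = q.shape[-1]
    scores = K @ q / np.sqrt(d)        # one score per context observation
    w = np.exp(scores - scores.max())  # numerically stable softmax
    w /= w.sum()
    return w @ V                       # weighted sum of context values

rng = np.random.default_rng(0)
d = 96  # hypothetical token width
# Variable-length context: any number of observations from any modality subset.
context = [rng.standard_normal(d) for _ in range(rng.integers(1, 6))]
# Query token encodes only the target location and modality description.
query = rng.standard_normal(d)
K = V = np.stack(context)
pred_features = attention(query, K, V)
assert pred_features.shape == (d,)
```

Nothing in the attention step assumes a fixed number or ordering of inputs, which is why the same trained model can predict with no observations, one, or many, drawn from any combination of modalities.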
Authors

Arnab Mazumder — Energy and Natural Resources Security Group, EES-16: Earth and Environmental Sciences Division, Los Alamos National Laboratory
Javier E. Santos — Center for Nonlinear Studies, Los Alamos National Laboratory (porous media)
Noah Hobbs — Energy and Natural Resources Security Group, EES-16: Earth and Environmental Sciences Division, Los Alamos National Laboratory
Mohamed Mehana — Los Alamos National Laboratory
Daniel O'Malley — Los Alamos National Laboratory (applied mathematics, machine learning, computational science, quantum computing)