AI Summary
This work addresses two obstacles to end-to-end intelligent applications in Earth science, from perception to scientific discovery: the heterogeneity of multimodal geoscientific data, and limits on model reliability and sustainability. The authors propose a two-dimensional "depth-breadth" analytical framework. Depth spans perception, multimodal reasoning, and agentic scientific workflows; breadth covers the atmosphere, hydrosphere, lithosphere, biosphere, anthroposphere, and cryosphere, along with coupled Earth system processes. Using this framework, the study reviews the evolution of Earth foundation models, including multimodal fusion techniques, model architectures, and the integration of heterogeneous observational data, and compiles more than 200 datasets and benchmarks. It also outlines a transition toward agentic and embodied AI that can act as an Earth scientist, and charts a path toward trustworthy, integrated, and sustainable AI-driven Earth science.
Abstract
Large foundation models (FMs) are transforming Earth science by integrating heterogeneous multimodal data, such as multi-platform imagery, gridded reanalysis data, diverse geophysical and geochemical observations, and domain-specific text, to support tasks ranging from basic perception to advanced scientific discovery. This paper provides a unified review of Earth science foundation models (Earth FMs) through two complementary dimensions: depth, which traces the evolution of model capabilities from perception to multimodal reasoning and agentic scientific workflows, and breadth, which summarizes their expanding applications across the atmosphere, hydrosphere, lithosphere, biosphere, anthroposphere, and cryosphere, as well as coupled Earth system processes. Using this framework, we review representative multimodal Earth foundation models and compile more than 200 datasets and benchmarks spanning diverse Earth science tasks and modalities. We further discuss key challenges in multimodal data heterogeneity, scientific reliability and continual updating, scalability and sustainability, and the transition from foundation models to agentic and embodied Earth intelligence, and outline future directions toward more integrated, trustworthy, and actionable AI Earth scientists. Overall, this paper offers a structured roadmap for understanding the development of Earth foundation models from both capability depth and application breadth.