Recommendations for Comprehensive and Independent Evaluation of Machine Learning-Based Earth System Models

📅 2024-10-24
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Machine learning-based Earth system models (ML-ESMs) must represent alternate (e.g., future) coupled states of the Earth system for which there are no historical observations, and the physical principles that govern the system's response are often not explicitly coded in them, which makes their credibility hard to establish. Method: The paper puts forward five recommendations for comprehensive, standardized, and independent evaluation of ML-based ESMs, aimed at building evidence of their consistency with the physical system. Contribution/Results: The recommendations are intended to strengthen the credibility of ML-based ESMs and promote their wider use, and they treat ESM evaluation as a harder problem than the evaluation of weather-forecasting models.

📝 Abstract
Machine learning (ML) is a revolutionary technology with demonstrable applications across multiple disciplines. Within the Earth science community, ML has been most visible for weather forecasting, producing forecasts that rival modern physics-based models. Given the importance of deepening our understanding and improving predictions of the Earth system on all time scales, efforts are now underway to develop forecasting models into Earth-system models (ESMs), capable of representing all components of the coupled Earth system (or their aggregated behavior) and their response to external changes. Modeling the Earth system is a much more difficult problem than weather forecasting, not least because the model must represent the alternate (e.g., future) coupled states of the system for which there are no historical observations. Given that the physical principles that enable predictions about the response of the Earth system are often not explicitly coded in these ML-based models, demonstrating the credibility of ML-based ESMs thus requires us to build evidence of their consistency with the physical system. To this end, this paper puts forward five recommendations to enhance comprehensive, standardized, and independent evaluation of ML-based ESMs to strengthen their credibility and promote their wider use.
Problem

Research questions and friction points this paper is trying to address.

Machine Learning
Earth System Models
Reliability Assessment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Machine Learning
Earth System Models
Reliability and Physical Consistency
Paul A. Ullrich
Lawrence Livermore National Laboratory and University of California Davis
Elizabeth A. Barnes
Colorado State University
William D. Collins
Lawrence Berkeley National Laboratory
Katherine Dagon
NSF National Center for Atmospheric Research
Shiheng Duan
Lawrence Livermore National Laboratory
Joshua Elms
Indiana University Bloomington
Jiwoo Lee
Staff Scientist, Lawrence Livermore National Laboratory
Climate, climate modeling, diagnostic metrics, big-data visualization, numerical weather prediction
L. R. Leung
Pacific Northwest National Laboratory
Dan Lu
Oak Ridge National Laboratory
Maria J. Molina
University of Maryland, College Park
Travis A. O’Brien
Indiana University Bloomington and Lawrence Berkeley National Laboratory
F. Rebassoo
Lawrence Livermore National Laboratory