Exploring Design Choices for Autoregressive Deep Learning Climate Models

📅 2025-05-05
🤖 AI Summary
Problem: Deep learning (DL) weather models achieve state-of-the-art skill in medium-range forecasting, but their rollouts often become physically inconsistent or unstable beyond 14 days, whereas a few atmospheric models remain stable over decades for reasons that are not yet well understood. Method: This study compares the long-term stability of three autoregressive DL architectures (FourCastNet, SFNO, and ClimaX) trained on ERA5 reanalysis data at 5.625° resolution and evaluated on 10-year rollouts. It systematically assesses how the number of autoregressive training steps, model capacity, and the choice of prognostic variables affect stability and the statistical fidelity of the rollouts. Contribution/Results: The study identifies configurations that enable stable 10-year rollouts while preserving the statistical properties of the reference dataset. SFNO rollouts are the most robust to hyperparameter choices, yet all three models can become unstable depending on the random seed and the set of prognostic variables.

📝 Abstract
Deep learning (DL) models have achieved state-of-the-art performance in medium-range weather prediction but often fail to maintain physically consistent rollouts beyond 14 days. In contrast, a few atmospheric models demonstrate stability over decades, though the key design choices enabling this remain unclear. This study quantitatively compares the long-term stability of three prominent DL-MWP architectures (FourCastNet, SFNO, and ClimaX) trained on ERA5 reanalysis data at 5.625° resolution. We systematically assess the impact of autoregressive training steps, model capacity, and choice of prognostic variables, identifying configurations that enable stable 10-year rollouts while preserving the statistical properties of the reference dataset. Notably, rollouts with SFNO exhibit the greatest robustness to hyperparameter choices, yet all models can experience instability depending on the random seed and the set of prognostic variables.
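To make the idea of a long autoregressive rollout concrete, here is a minimal sketch in Python. It is not the paper's code: `make_toy_step` is a hypothetical stand-in for a trained network, the 6-hourly time step and the blow-up `bound` are assumptions chosen for illustration, and only the 32 x 64 grid follows directly from the 5.625° resolution.

```python
import numpy as np

# Assumption: a 6-hourly model step, so 10 years is 10 * 365 * 4 steps.
N_STEPS = 10 * 365 * 4
# A 5.625-degree global grid has 32 latitudes x 64 longitudes.
GRID = (32, 64)


def make_toy_step(gain):
    """Hypothetical stand-in for a trained network acting on anomaly fields."""
    def step(state):
        return gain * state
    return step


def rollout_is_stable(step_fn, state, n_steps, bound):
    """Feed each prediction back as the next input; stop at the first blow-up."""
    for _ in range(n_steps):
        state = step_fn(state)
        if not np.all(np.isfinite(state)) or np.abs(state).max() > bound:
            return False
    return True


rng = np.random.default_rng(0)
x0 = rng.standard_normal(GRID)  # toy initial anomaly field

damped = rollout_is_stable(make_toy_step(0.99), x0, N_STEPS, bound=100.0)
amplified = rollout_is_stable(make_toy_step(1.01), x0, N_STEPS, bound=100.0)
print(damped, amplified)
```

In this toy setting the slightly damped step survives the full 10-year rollout, while the step that amplifies its input by 1% per iteration exceeds the bound after a few hundred steps. A small per-step error that is harmless at forecast range can therefore dominate a decade-long integration.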
Problem

Research questions and friction points this paper is trying to address.

Evaluating long-term stability of deep learning climate models
Identifying key design choices for stable decade-long rollouts
Comparing robustness of three DL-MWP architectures (FourCastNet, SFNO, ClimaX)
Innovation

Methods, ideas, or system contributions that make the work stand out.

Assesses how the number of autoregressive training steps affects long-term stability
Examines how model capacity affects 10-year rollouts
Evaluates how the choice of prognostic variables affects stability and statistical consistency
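The role of autoregressive training steps can be illustrated with a multi-step loss, where the model is unrolled on its own predictions and the error is averaged over all unrolled steps. The following is a minimal sketch under stated assumptions, not the training objective used in the paper: `autoregressive_loss`, the scalar-gain toy model, and the mean squared error are all hypothetical choices for illustration.

```python
import numpy as np


def autoregressive_loss(step_fn, x0, targets):
    """Mean squared error averaged over an unrolled sequence of predictions."""
    state = x0
    losses = []
    for target in targets:
        state = step_fn(state)  # the prediction becomes the next input
        losses.append(np.mean((state - target) ** 2))
    return float(np.mean(losses))


def true_step(x):
    """Toy reference dynamics (assumption): decay by 10% per step."""
    return 0.9 * x


def model_step(x):
    """Toy imperfect model (assumption): decays slightly too fast."""
    return 0.8 * x


x0 = np.ones((32, 64))

# Build reference targets for one and two steps ahead.
t1 = true_step(x0)
t2 = true_step(t1)

loss_1_step = autoregressive_loss(model_step, x0, [t1])
loss_2_step = autoregressive_loss(model_step, x0, [t1, t2])
print(loss_1_step, loss_2_step)
```

The two-step loss is larger than the one-step loss because the model's error compounds when it consumes its own output. Training on more unrolled steps exposes this compounding to the optimizer, at the price of more computation per sample.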
Authors

Florian Gallusser
Data Science Chair, Center for Artificial Intelligence and Data Science (CAIDAS), University of Würzburg
Simon Hentschel
Data Science Chair, Center for Artificial Intelligence and Data Science (CAIDAS), University of Würzburg
Anna Krause
Data Science Chair, Center for Artificial Intelligence and Data Science (CAIDAS), University of Würzburg
Andreas Hotho
University of Würzburg
Data Science, NLP, Knowledge Graphs, ML for Environmental Science, ML for Network Security and Fraud