Can AI weather models predict out-of-distribution gray swan tropical cyclones?

📅 2024-10-19

🏛️ arXiv.org

📈 Citations: 7

✨ Influential: 1

🤖 AI Summary

This study investigates whether AI-based weather models can extrapolate to rare extreme events that are entirely absent from their training data, specifically Category 5 “gray swan” tropical cyclones. Independent versions of FourCastNet are trained on ERA5 reanalysis data (1979–2015), either on all data or with Category 3–5 cyclones removed globally or only over the North Atlantic or Western Pacific basin, and are then tested on Category 5 storms from 2018–2023. All versions reach similar accuracy for global weather, but the version trained without any Category 3–5 cyclones cannot accurately forecast Category 5 storms, indicating that the model cannot extrapolate from weaker cyclones. When strong cyclones are removed from only one basin, the model retains some skill for Category 5 storms in that basin, suggesting that FourCastNet can generalize across tropical basins. The authors conclude that novel learning strategies are needed for reliable early warning of the rarest, most impactful cyclones.

📝 Abstract

Predicting gray swan weather extremes, which are possible but so rare that they are absent from the training dataset, is a major concern for AI weather models and long-term climate emulators. An important open question is whether AI models can extrapolate from weaker weather events present in the training set to stronger, unseen weather extremes. To test this, we train independent versions of the AI model FourCastNet on the 1979-2015 ERA5 dataset with all data, or with Category 3-5 tropical cyclones (TCs) removed, either globally or only over the North Atlantic or Western Pacific basin. We then test these versions of FourCastNet on 2018-2023 Category 5 TCs (gray swans). All versions yield similar accuracy for global weather, but the one trained without Category 3-5 TCs cannot accurately forecast Category 5 TCs, indicating that these models cannot extrapolate from weaker storms. The versions trained without Category 3-5 TCs in one basin show some skill forecasting Category 5 TCs in that basin, suggesting that FourCastNet can generalize across tropical basins. This is encouraging and surprising because regional information is implicitly encoded in inputs. Given that current state-of-the-art AI weather and climate models have similar learning strategies, we expect our findings to apply to other models. Other types of weather extremes need to be similarly investigated. Our work demonstrates that novel learning strategies are needed for AI models to reliably provide early warning or estimated statistics for the rarest, most impactful TCs, and, possibly, other weather extremes.

Problem

Research questions and friction points this paper is trying to address.

Can AI models predict rare gray swan tropical cyclones?

Do AI models extrapolate from weak to extreme weather events?

Can AI generalize forecasting skills across different tropical basins?

Innovation

Methods, ideas, or system contributions that make the work stand out.

Training independent FourCastNet versions with Category 3-5 cyclones removed globally or from a single basin

Testing generalization across tropical cyclone basins

Highlighting need for novel learning strategies
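
The data-exclusion experiments described above can be illustrated with a minimal sketch. It assumes, hypothetically, that each training sample is identified by a timestamp and that a best-track table gives basin and maximum sustained wind per timestamp; the names `TrackPoint`, `times_to_exclude`, and `filter_training_times`, and the basin codes `"NA"` and `"WP"`, are illustrative and not taken from the paper. Dropping whole time steps is also an assumption, since the abstract does not state how samples were removed. The wind thresholds are the standard Saffir-Simpson category bounds in knots.

```python
from dataclasses import dataclass

# Saffir-Simpson lower bounds in knots (1-minute sustained wind).
CATEGORY_MIN_KT = {1: 64, 2: 83, 3: 96, 4: 113, 5: 137}


def saffir_simpson_category(wind_kt: float) -> int:
    """Return 0 below hurricane strength, otherwise the category 1-5."""
    category = 0
    for cat, lower_bound in CATEGORY_MIN_KT.items():
        if wind_kt >= lower_bound:
            category = cat
    return category


@dataclass(frozen=True)
class TrackPoint:
    """One best-track record (hypothetical schema, for illustration only)."""
    time: str       # timestamp of a training sample
    basin: str      # assumed basin code, e.g. "NA" or "WP"
    wind_kt: float  # maximum sustained wind in knots


def times_to_exclude(track, basin=None, min_category=3):
    """Timestamps with a TC at or above min_category.

    basin=None removes such storms globally; otherwise only in that basin.
    """
    return {
        p.time
        for p in track
        if saffir_simpson_category(p.wind_kt) >= min_category
        and (basin is None or p.basin == basin)
    }


def filter_training_times(all_times, track, basin=None):
    """Keep only timestamps that are not flagged for exclusion."""
    excluded = times_to_exclude(track, basin)
    return [t for t in all_times if t not in excluded]


if __name__ == "__main__":
    track = [
        TrackPoint("t0", "NA", 150.0),  # Category 5 in the North Atlantic
        TrackPoint("t1", "WP", 100.0),  # Category 3 in the Western Pacific
        TrackPoint("t2", "NA", 70.0),   # Category 1, always kept
    ]
    all_times = ["t0", "t1", "t2", "t3"]
    print(filter_training_times(all_times, track))              # global removal
    print(filter_training_times(all_times, track, basin="NA"))  # one basin only
```

Calling `filter_training_times` with `basin=None` mirrors the global-removal experiment, while passing a single basin code mirrors the regional-removal variants in which strong storms from other basins stay in the training set.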
