Numerical models outperform AI weather forecasts of record-breaking extremes

📅 2025-08-21
🤖 AI Summary
AI-based weather forecasting models (GraphCast, Pangu-Weather, Fuxi, and the operational versions of GraphCast and Pangu-Weather) have not been shown to reliably predict record-breaking extremes of heat, cold, and wind, which are particularly frequent in a rapidly warming climate. This raises concerns for early warning and disaster management. Method: The authors evaluate the models' ability to extrapolate by comparing them with ECMWF’s numerical High RESolution forecast (HRES), examining errors in the frequency and intensity of record-breaking events and how those errors depend on lead time. Results: The examined AI models share the same deficiencies: they underestimate the frequency and intensity of record-breaking events, underpredict hot records, overestimate cold records, and their errors grow with larger record exceedance. HRES has smaller errors than the AI models across nearly all lead times. The work documents the limits of current AI weather models when extrapolating beyond their training domain and calls for further verification and model development before they are solely relied upon for high-stakes applications.

📝 Abstract
Artificial intelligence (AI)-based models are revolutionizing weather forecasting and have surpassed leading numerical weather prediction systems on various benchmark tasks. However, their ability to extrapolate and reliably forecast unprecedented extreme events remains unclear. Here, we show that for record-breaking weather extremes, the numerical model High RESolution forecast (HRES) from the European Centre for Medium-Range Weather Forecasts still consistently outperforms state-of-the-art AI models GraphCast, GraphCast operational, Pangu-Weather, Pangu-Weather operational, and Fuxi. We demonstrate that forecast errors in AI models are consistently larger for record-breaking heat, cold, and wind than in HRES across nearly all lead times. We further find that the examined AI models tend to underestimate both the frequency and intensity of record-breaking events, and they underpredict hot records and overestimate cold records with growing errors for larger record exceedance. Our findings underscore the current limitations of AI weather models in extrapolating beyond their training domain and in forecasting the potentially most impactful record-breaking weather events that are particularly frequent in a rapidly warming climate. Further rigorous verification and model development is needed before these models can be solely relied upon for high-stakes applications such as early warning systems and disaster management.
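To make the terms "record-breaking" and "record exceedance" concrete, here is a minimal sketch of how a hot record could be flagged and a forecast scored against it. It assumes a record is any observed value above the maximum of a prior reference period; the paper's exact definition, data, and variable names are not given in this summary, so `history`, `observed`, `forecast`, and `hot_records` are hypothetical and the numbers are toy values.

```python
import numpy as np

def hot_records(history, observed):
    """Flag hot records and measure by how much they exceed the old record.

    history:  array (n_years, n_points) of past values (assumed reference period)
    observed: array (n_points,) of values on the day being verified
    """
    prev_record = history.max(axis=0)      # highest value seen so far per point
    exceedance = observed - prev_record    # positive where a new record is set
    return exceedance > 0, exceedance

# Toy data in degrees Celsius for three grid points (illustrative only).
history = np.array([[30.0, 25.0, 20.0],
                    [31.0, 26.0, 21.0],
                    [29.0, 27.0, 22.0]])
observed = np.array([33.0, 26.5, 22.5])
forecast = np.array([31.5, 26.0, 22.4])

is_record, exceedance = hot_records(history, observed)

# Signed error at record-breaking points; negative means the forecast was too cool.
mean_bias = (forecast - observed)[is_record].mean()
print(is_record, exceedance, mean_bias)
```

A negative `mean_bias` at record-breaking points corresponds to the underprediction of hot records described in the abstract. Cold records would use the minimum of `history` and the opposite inequality.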
Problem

Research questions and friction points this paper is trying to address.

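Whether AI weather models can extrapolate to unprecedented, record-breaking extremes remains unclear
AI models have surpassed numerical models on various benchmark tasks, but not specifically on record-breaking events
Record-breaking events are particularly frequent in a rapidly warming climate and matter most for early warning and disaster management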
Innovation

Methods, ideas, or system contributions that make the work stand out.

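Compares HRES with GraphCast, Pangu-Weather (each also in its operational version), and Fuxi specifically on record-breaking heat, cold, and wind
Shows that the AI models underestimate both the frequency and the intensity of record-breaking events
Shows that AI model errors grow with record exceedance: hot records are underpredicted and cold records overestimated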
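The frequency and exceedance findings listed above can be illustrated with two simple diagnostics. This is a sketch under stated assumptions, not the paper's verification code: `frequency_ratio` and `mean_error_by_exceedance` are hypothetical helpers, the bin edges are arbitrary, and the data are toy values chosen to mimic an underpredicting forecast.

```python
import numpy as np

def frequency_ratio(forecast, observed, prev_record):
    """Forecast record count divided by observed record count (below 1 = too few)."""
    n_obs = np.count_nonzero(observed > prev_record)
    n_fc = np.count_nonzero(forecast > prev_record)
    return n_fc / n_obs if n_obs else float("nan")

def mean_error_by_exceedance(forecast, observed, prev_record, bin_edges):
    """Mean signed forecast error at observed records, grouped by exceedance bin."""
    exceedance = observed - prev_record
    is_record = exceedance > 0
    errors = (forecast - observed)[is_record]
    # np.digitize returns bin index i where bin_edges[i-1] <= x < bin_edges[i].
    bins = np.digitize(exceedance[is_record], bin_edges)
    return {int(b): float(errors[bins == b].mean()) for b in np.unique(bins)}

# Toy data in degrees Celsius (illustrative only).
prev_record = np.full(5, 30.0)
observed = np.array([30.5, 31.0, 32.5, 33.0, 29.0])
forecast = np.array([29.8, 30.6, 31.0, 31.0, 29.5])

ratio = frequency_ratio(forecast, observed, prev_record)
by_bin = mean_error_by_exceedance(forecast, observed, prev_record, [0.0, 2.0])
print(ratio, by_bin)
```

In this toy case the forecast produces fewer records than were observed, and the mean error is more negative in the higher exceedance bin, which is the pattern the paper reports for hot records.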