🤖 AI Summary
This study addresses a gap in current AI weather forecast evaluation, which emphasizes meteorological metrics while overlooking the real-world decision-making needs of users in low- and middle-income regions facing high-impact weather events. The authors propose a decision-oriented evaluation framework that connects meteorology, artificial intelligence, and social science, building the needs of stakeholders in India's rain-fed agricultural communities into the AI forecast assessment pipeline. Using open-access AI models and both deterministic and probabilistic metrics, the framework evaluates out-of-sample, regional-scale predictions of monsoon onset, a critical indicator for agricultural planning, made weeks in advance. The framework informed a government-led effort in 2025 to send AI-based monsoon onset forecasts to 38 million Indian farmers; those forecasts captured an unusual weeks-long pause in monsoon progression. The work thereby moves AI model evaluation from technical performance toward tangible societal impact.
📝 Abstract
Artificial intelligence weather prediction (AIWP) models now often outperform traditional physics-based models on common metrics while requiring orders of magnitude fewer computing resources and less time. Open-access AIWP models thus hold promise as transformational tools for helping low- and middle-income populations make decisions in the face of high-impact weather shocks. Yet current approaches to evaluating AIWP models focus mainly on aggregated meteorological metrics without considering local stakeholders' needs in decision-oriented, operational frameworks. Here, we introduce such a framework that connects meteorology, AI, and social sciences. As an example, we apply it to the 150-year-old problem of Indian monsoon forecasting, focusing on benefits to rain-fed agriculture, which is highly susceptible to climate change. AIWP models skillfully predict an agriculturally relevant onset index at regional scales weeks in advance when evaluated out-of-sample using deterministic and probabilistic metrics. This framework informed a government-led effort in 2025 to send 38 million Indian farmers AI-based monsoon onset forecasts, which captured an unusual weeks-long pause in monsoon progression. This decision-oriented benchmarking framework provides a key component of a blueprint for harnessing the power of AIWP models to help large vulnerable populations adapt to weather shocks in the face of climate variability and change.
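The abstract does not say which deterministic and probabilistic metrics were used. As a purely illustrative sketch, the following assumes two common choices: mean absolute error of the onset date (in days) as the deterministic metric, and the Brier score of a forecast probability that onset falls within some window as the probabilistic one. The function names and the sample numbers are hypothetical and are not taken from the paper.

```python
def onset_mae(forecast_days, observed_days):
    """Mean absolute error, in days, between forecast and observed onset dates.

    Both inputs are equal-length sequences of onset dates expressed as
    day-of-year (an assumed encoding, not one stated in the abstract).
    """
    if len(forecast_days) != len(observed_days) or not forecast_days:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(f - o) for f, o in zip(forecast_days, observed_days)]
    return sum(errors) / len(errors)


def brier_score(onset_probs, onset_occurred):
    """Brier score for probabilistic onset forecasts (lower is better).

    onset_probs: forecast probabilities in [0, 1] that onset occurs in a window.
    onset_occurred: 1 if onset was observed in that window, else 0.
    """
    if len(onset_probs) != len(onset_occurred) or not onset_probs:
        raise ValueError("inputs must be non-empty and of equal length")
    sq_errors = [(p - y) ** 2 for p, y in zip(onset_probs, onset_occurred)]
    return sum(sq_errors) / len(sq_errors)


# Made-up example values for three hypothetical region-years.
mae = onset_mae([160, 165, 158], [162, 165, 155])
bs = brier_score([0.8, 0.2, 0.6], [1, 0, 1])
```

In an out-of-sample evaluation, both scores would be computed only on years that were withheld from model training, and typically compared against a baseline such as a climatological forecast.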