🤖 AI Summary
Daily-scale Indian summer monsoon rainfall forecasting has long been hindered by strong nonlinearity and multiscale dynamical drivers. This study proposes a purely data-driven deep learning framework that relies solely on historical precipitation observations. Using the 1901–2022 IMD gridded dataset (1°×1°), it trains neural networks to predict rainfall one day and three days in advance; up to 20 days of past data helps reduce errors, most clearly with a Transformer-based architecture and to a lesser extent with an LSTM. For the monsoon season (June–September), the method outperforms NCEP numerical weather prediction, whose error is about 34% higher (one-day) and over 68% higher (three-day) than that of the proposed forecasts, and persistence forecasting, whose error is about 29% and over 54% higher, respectively. The comparison covers both a country-wide analysis and several of India's most populous cities. The key finding is that historical precipitation alone carries enough predictive signal to beat these baselines at one- to three-day lead times, which suggests that NWP forecasts could be improved further by combining more diverse monsoon-relevant data with carefully selected neural network architectures.
📝 Abstract
In this draft we consider the problem of forecasting rainfall across India during the four monsoon months, one day as well as three days in advance. We train neural networks using historical daily gridded precipitation data for India obtained from IMD for the period 1901–2022, at a spatial resolution of $1^{\circ} \times 1^{\circ}$. The resulting forecasts are compared with the numerical weather prediction (NWP) forecasts obtained from NCEP (National Centers for Environmental Prediction), available for the period 2011–2022. We conduct a detailed country-wide analysis and separately analyze some of the most populous cities in India. Our conclusion is that forecasts obtained by applying deep learning to historical rainfall data are more accurate than both NWP forecasts and predictions based on persistence. On average, compared to our predictions, forecasts from the NCEP NWP model have about 34% higher error for a single-day prediction, and over 68% higher error for a three-day prediction. Similarly, persistence estimates have about 29% higher error for a single-day forecast, and over 54% higher error for a three-day forecast. We further observe that data up to 20 days in the past is useful in reducing the errors of one- and three-day forecasts when a transformer-based learning architecture is used, and to a lesser extent when an LSTM is used. A key conclusion suggested by our preliminary analysis is that NWP forecasts can be substantially improved upon by combining more, and more diverse, data relevant to monsoon prediction with a carefully selected neural network architecture.
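The evaluation setup described in the abstract can be illustrated with a minimal sketch, which is not the authors' code. It builds 20-day lookback windows from a single grid cell's daily rainfall series, forms a persistence baseline, and expresses a baseline's error as "percent higher" than a model's error. Several details are assumptions made for illustration only: the function names are hypothetical, the error metric is taken to be mean absolute error, the three-day target is taken to be rainfall accumulated over three days, and persistence is taken to mean repeating the last observed day.

```python
import numpy as np


def make_windows(series, lookback=20, horizon=1):
    """Split a 1-D daily rainfall series into (past window, target) pairs.

    Assumption: the target is rainfall accumulated over the next `horizon` days.
    """
    series = np.asarray(series, dtype=float)
    X, y = [], []
    for t in range(lookback, len(series) - horizon + 1):
        X.append(series[t - lookback:t])       # the previous `lookback` days
        y.append(series[t:t + horizon].sum())  # accumulated rainfall ahead
    return np.array(X), np.array(y)


def persistence_forecast(X, horizon=1):
    """Assumed persistence rule: every forecast day repeats the last observed day."""
    return X[:, -1] * horizon


def mean_abs_error(pred, obs):
    """Mean absolute error (assumed metric; the abstract only says 'error')."""
    return float(np.mean(np.abs(np.asarray(pred) - np.asarray(obs))))


def percent_higher_error(baseline_err, model_err):
    """How much higher the baseline error is, as a percentage of the model error."""
    return 100.0 * (baseline_err - model_err) / model_err


def equivalent_reduction(percent_higher):
    """Convert 'baseline error is p% higher than the model error' into
    'the model reduces the baseline error by r%'."""
    return 100.0 * (1.0 - 1.0 / (1.0 + percent_higher / 100.0))
```

Under these assumptions, a learned model would take the rows of `X` as input in place of `persistence_forecast`, and its error would be compared with the baseline through `percent_higher_error`. Note that "p% higher error" is not the same as a "p% error reduction": if the reported figures are ratios of average errors, `equivalent_reduction(34)` gives roughly 25%, and `equivalent_reduction(68)` roughly 40%.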