🤖 AI Summary
To address insufficient feature utilization, unsystematic hyperparameter optimization, and weak model fusion strategies in rainfall prediction, this paper proposes an end-to-end robust ensemble learning framework. Methodologically, it incorporates novel meteorological features—such as temperature–humidity differentials—and conducts the first systematic evaluation of state-of-the-art models including Kolmogorov–Arnold Networks (KANs). It further introduces PCA-enhanced feature reconstruction and a multi-model ensemble voting mechanism. The pipeline integrates outlier detection, multiple imputation, PCA-based dimensionality reduction, grid-search hyperparameter tuning, and ensemble integration of KAN, SVM, and XGBoost. Evaluated on real-world meteorological datasets, the framework achieves an 18.7% reduction in mean absolute error over conventional methods, demonstrating significant improvements in predictive accuracy, robustness, and generalization capability.
📝 Abstract
Rainfall prediction remains a persistent challenge due to the highly nonlinear and complex nature of meteorological data. Existing approaches lack systematic utilization of grid search for optimal hyperparameter tuning, relying instead on heuristic or manual selection, frequently resulting in sub-optimal results. Additionally, these methods rarely incorporate newly constructed meteorological features such as differences between temperature and humidity to capture critical weather dynamics. Furthermore, there is a lack of systematic evaluation of ensemble learning techniques and limited exploration of diverse advanced models introduced in the past one or two years. To address these limitations, we propose a robust ensemble learning grid search-tuned framework (RAINER) for rainfall prediction. RAINER incorporates a comprehensive feature engineering pipeline, including outlier removal, imputation of missing values, feature reconstruction, and dimensionality reduction via Principal Component Analysis (PCA). The framework integrates novel meteorological features to capture dynamic weather patterns and systematically evaluates non-learning mathematical-based methods and a variety of machine learning models, from weak classifiers to advanced neural networks such as Kolmogorov-Arnold Networks (KAN). By leveraging grid search for hyperparameter tuning and ensemble voting techniques, RAINER achieves promising results within real-world datasets.