LLMs&XAI for Water Sustainability: Seasonal Water Quality Prediction with LIME Explainable AI and a RAG-based Chatbot for Insights

📅 2024-09-17

🤖 AI Summary
This study targets water quality prediction and interpretable monitoring in developing countries such as Nepal, where data are scarce and strongly seasonal. It combines tree-based models (CatBoost, XGBoost, Extra Trees, LightGBM) with a neural network built from CNN and RNN layers, applies LIME to explain individual XGBoost classification decisions by attributing them to indicators such as EC and DO, and adds a RAG-based chatbot for water quality insights. For WQI regression, the tree-based regressors reach an average RMSE of 1.2 with R² = 0.99, while the neural network reaches an RMSE of 2.87 with R² = 0.97. For classification, the classifiers reach 99% accuracy and the neural network reaches 92%. The authors also built a multifunctional application that predicts WQI using both regression and classification.
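For readers less familiar with the two regression metrics quoted above, the following is a minimal sketch of how RMSE and R² are conventionally computed. The WQI values are invented purely for illustration and are not taken from the paper; the function names are likewise this example's own.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: sqrt(mean((y - y_hat)^2))."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical WQI values, made up for illustration only (not from the paper).
wqi_true = [40.0, 55.0, 70.0, 85.0]
wqi_pred = [41.0, 54.0, 71.0, 84.0]

print("RMSE:", rmse(wqi_true, wqi_pred))
print("R^2 :", r2_score(wqi_true, wqi_pred))
```

RMSE is expressed in the same units as the WQI itself, whereas R² is unitless and measures the share of variance in the true values that the predictions account for, so the two numbers are best read together.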

📝 Abstract
Ensuring safe water supplies requires effective water quality monitoring, especially in developing countries like Nepal, where contamination risks are high. This paper introduces a hybrid deep learning model to predict Nepal's seasonal water quality using a small dataset with multiple water quality parameters. Models such as CatBoost, XGBoost, Extra Trees, and LightGBM, along with a neural network combining CNN and RNN layers, are used to capture temporal and spatial patterns in the data. The model demonstrated notable accuracy improvements, aiding proactive water quality control. CatBoost, XGBoost, and Extra Trees Regressor predicted Water Quality Index (WQI) values with an average RMSE of 1.2 and an R² score of 0.99. Additionally, classifiers achieved 99 percent accuracy, cross-validated across models. LIME analysis highlighted the importance of indicators like electrical conductivity (EC) and dissolved oxygen (DO) levels in XGBoost classification decisions. The neural network model achieved 92 percent classification accuracy and an R² score of 0.97, with an RMSE of 2.87 in regression analysis. Furthermore, a multifunctional application was developed to predict WQI values using both regression and classification methods.
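To illustrate the idea behind the LIME analysis mentioned in the abstract, here is a simplified from-scratch sketch of a LIME-style local explanation. It is not the authors' code and does not use the `lime` library. The black-box function, its weights, the feature set (EC, DO, pH) and all parameter values are hypothetical stand-ins chosen for illustration. The sketch perturbs one sample, queries the model, weights the perturbations by proximity, and fits a weighted linear surrogate whose coefficients act as local feature attributions.

```python
import numpy as np


def black_box_prob(X):
    """Hypothetical stand-in for a trained classifier's P(poor water quality).

    Columns are standardized EC, DO, pH. The weights are invented for this
    example and do not come from the paper.
    """
    z = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.0 * X[:, 2]
    return 1.0 / (1.0 + np.exp(-z))


def lime_style_explain(predict_fn, x, n_samples=2000, noise_scale=0.5,
                       kernel_width=1.0, ridge=1e-3, seed=0):
    """Return local linear attributions for predict_fn around the point x."""
    rng = np.random.default_rng(seed)
    # 1. Perturb the instance with Gaussian noise.
    Z = x + rng.normal(scale=noise_scale, size=(n_samples, x.size))
    # 2. Query the black-box model on the perturbed samples.
    y = predict_fn(Z)
    # 3. Weight each sample by its proximity to x (exponential kernel).
    dist = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(dist ** 2) / (kernel_width ** 2))
    # 4. Fit a weighted ridge regression with an unpenalized intercept.
    A = np.hstack([np.ones((n_samples, 1)), Z - x])
    AtW = A.T * w
    reg = ridge * np.eye(A.shape[1])
    reg[0, 0] = 0.0
    beta = np.linalg.solve(AtW @ A + reg, AtW @ y)
    return beta[1:]  # drop the intercept; keep per-feature coefficients


feature_names = ["EC", "DO", "pH"]   # illustrative feature set
sample = np.array([0.2, -0.1, 0.0])  # one hypothetical standardized sample
coefs = lime_style_explain(black_box_prob, sample)
for name, c in zip(feature_names, coefs):
    print(f"{name}: {c:+.3f}")
```

In this toy setup the surrogate should assign a positive coefficient to EC, a negative one to DO, and a near-zero one to pH, mirroring how the hypothetical black box was constructed. The real LIME tabular explainer differs in several respects, such as how perturbations are sampled and how features are discretized, so this is only a conceptual sketch.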
Problem

Research questions and friction points this paper is trying to address.

Water Quality Prediction
Seasonal Monitoring
Environmental Risk
Innovation

Methods, ideas, or system contributions that make the work stand out.

Interpretable AI
Neural Networks
Water Quality Prediction