Balancing Performance and Fairness in Explainable AI for Anomaly Detection in Distributed Power Plants Monitoring

📅 2026-03-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes an ensemble learning framework that jointly optimizes performance, interpretability, and cross-regional fairness to address three key challenges in anomaly detection for distributed power plants: extreme class imbalance, model opacity, and lack of regional fairness. The framework mitigates data imbalance via SMOTE-Tomek/ENN resampling, provides feature-level interpretability using SHAP values, and incorporates Disparate Impact Ratio (DIR) and Maximum Mean Discrepancy (MMD) to evaluate regional fairness and cross-domain generalization, respectively. Evaluated on diesel generator data, the LightGBM/XGBoost-based approach achieves an F1-score of 0.99 and a DIR of approximately 0.95, identifying fuel consumption rate and daily operating duration as critical predictive features. The solution further supports low-latency, containerized deployment for real-time inference.
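As a concrete illustration of the fairness metric used above, the Disparate Impact Ratio (DIR) compares positive-prediction (anomaly-flag) rates across regional clusters: a value near 1.0 means the model flags anomalies at similar rates in every region. The sketch below is a minimal, generic implementation; the group labels and predictions are hypothetical, not the paper's data:

```python
import numpy as np

def disparate_impact_ratio(y_pred, groups):
    """DIR = (lowest group positive rate) / (highest group positive rate).

    y_pred : binary anomaly predictions (0/1)
    groups : regional cluster label for each prediction
    """
    rates = np.array([y_pred[groups == g].mean() for g in np.unique(groups)])
    return rates.min() / rates.max()

# Hypothetical predictions for two regional clusters
y_pred = np.array([1, 0, 1, 0, 1, 0, 1, 1])
groups = np.array(["north", "north", "north", "north",
                   "south", "south", "south", "south"])
dir_value = disparate_impact_ratio(y_pred, groups)
# north flags 2/4, south flags 3/4, so DIR = 0.5 / 0.75
```

A DIR of roughly 0.95, as reported for LightGBM, would mean the least-flagged region is flagged at 95% of the rate of the most-flagged one.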

📝 Abstract
Reliable anomaly detection in distributed power plant monitoring systems is essential for ensuring operational continuity and reducing maintenance costs, particularly in regions where telecom operators rely heavily on diesel generators. However, this task is challenged by extreme class imbalance, lack of interpretability, and potential fairness issues across regional clusters. In this work, we propose a supervised ML framework that integrates ensemble methods (LightGBM, XGBoost, Random Forest, CatBoost, GBDT, AdaBoost) and baseline models (Support Vector Machine, K-Nearest Neighbors, Multilayer Perceptron, and Logistic Regression) with advanced resampling techniques (SMOTE with Tomek Links and ENN) to address imbalance in a dataset of diesel generator operations in Cameroon. Interpretability is achieved through SHAP (SHapley Additive exPlanations), while fairness is quantified using the Disparate Impact Ratio (DIR) across operational clusters. We further evaluate model generalization using Maximum Mean Discrepancy (MMD) to capture domain shifts between regions. Experimental results show that ensemble models consistently outperform baselines, with LightGBM achieving an F1-score of 0.99 and minimal bias across clusters (DIR ≈ 0.95). SHAP analysis highlights fuel consumption rate and runtime per day as dominant predictors, providing actionable insights for operators. Our findings demonstrate that it is possible to balance performance, interpretability, and fairness in anomaly detection, paving the way for more equitable and explainable AI systems in industrial power management. Finally, beyond offline evaluation, we also discuss how the trained models can be deployed in practice for real-time monitoring. We show how containerized services can process data in real time, deliver low-latency predictions, and provide interpretable outputs for operators.
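The cross-regional generalization check mentioned in the abstract, Maximum Mean Discrepancy, measures how far apart two feature distributions are in a kernel-induced space; a larger value signals a stronger domain shift between regions. Below is a minimal numpy sketch of the biased squared-MMD estimator with an RBF kernel, on synthetic data standing in for feature vectors from two regions (the bandwidth `gamma` and the data are illustrative assumptions, not the paper's setup):

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Gaussian kernel on pairwise squared Euclidean distances
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def mmd2_biased(X, Y, gamma=1.0):
    """Biased estimate of squared MMD between samples X and Y.

    Zero when the two samples coincide; grows with distribution shift.
    """
    kxx = rbf_kernel(X, X, gamma).mean()
    kyy = rbf_kernel(Y, Y, gamma).mean()
    kxy = rbf_kernel(X, Y, gamma).mean()
    return kxx + kyy - 2.0 * kxy

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(50, 3))   # e.g. region-A feature vectors
Y = rng.normal(0.5, 1.0, size=(50, 3))   # e.g. region-B with a mean shift
mmd_shift = mmd2_biased(X, Y)            # positive: distributions differ
mmd_same = mmd2_biased(X, X)             # exactly zero: identical samples
```

In a deployment like the one described, a low MMD between training and target regions supports reusing one model across sites; a high MMD suggests per-region recalibration.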
Problem

Research questions and friction points this paper is trying to address.

anomaly detection
class imbalance
fairness
explainable AI
distributed power plants
Innovation

Methods, ideas, or system contributions that make the work stand out.

Explainable AI
Fairness-aware Learning
Anomaly Detection
Class Imbalance
SHAP Interpretability
Corneille Niyonkuru
African Institute for Mathematical Sciences (AIMS), Kigali, Rwanda
Marcellin Atemkeng
Associate Professor of Applied Mathematics & Machine Learning, Rhodes University
Big Data, Statistical Signal Processing, eXplainable AI, Deep Learning, Radio Astronomy
Gabin Maxime Nguegnang
Department of Mathematics, Ludwig Maximilian University of Munich, Bavaria, 80333, Germany
Arnaud Nguembang Fadja
Department of Engineering, University of Ferrara, Via Saragat 1, Ferrara, 44122, Italy