🤖 AI Summary
This study addresses the challenge of distinguishing physical faults from false data injection attacks in IoT-enabled smart grids, where the two often appear as indistinguishable anomalies. The authors propose a lightweight detection framework that combines genetic algorithms with tree-based ensemble models (Extra Trees, XGBoost, and Random Forest) to perform metaheuristic feature selection and reduce the dimensionality of PMU/IED measurements. Evaluated on the MSU/ORNL power system attack dataset, the GA + Extra Trees approach reduces the feature set from 112 attributes to an average of 27.4 over five runs while achieving a macro-F1 score of 0.9212 and a ROC-AUC of 0.9837, up from 0.9118 and 0.9791 for the full-feature Extra Trees baseline. The results indicate that the method removes substantial feature redundancy while slightly improving detection performance and making the model easier to interpret.
📝 Abstract
Modern smart grids rely on dense measurement infrastructures, communication links, and intelligent field devices. Although this improves supervision and control, it also increases vulnerability to cyber-physical disruptions. Operators must distinguish physical incidents, such as faults or line disturbances, from malicious actions, such as false data injection or unauthorized command execution. This chapter investigates this problem using the well-known MSU/ORNL Power System Attack Dataset. The proposed method combines machine learning with genetic-algorithm-based feature selection. The objective is twofold: to classify attack and natural events accurately, and to determine whether a reduced set of physically informative PMU/IED measurements can support reliable detection. Several baseline models are evaluated, including logistic regression, RBF-SVM, XGBoost, Random Forest, and Extra Trees. The results show that tree-based ensemble models are the most effective for the considered dataset, with Extra Trees providing the strongest full-feature baseline. After feature selection, the GA + Extra Trees model reduces the clean PMU feature space from 112 attributes to an average of 27.4 attributes over five runs, while increasing macro-F1 from 0.9118 to 0.9212 and ROC-AUC from 0.9791 to 0.9837. These results indicate that many synchronized electrical measurements are redundant. A compact subset of phasor-based features can still provide accurate and interpretable anomaly detection in smart grids.
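The genetic-algorithm feature selection described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the fitness function, genetic operators, or hyperparameters, so the bitmask encoding, tournament selection, uniform crossover, bit-flip mutation, elitism, and the size penalty below are all assumptions. The names `ga_select` and `toy_score` are hypothetical, and `toy_score` is only a stand-in for a real validation score such as the cross-validated macro-F1 of a classifier.

```python
import random
from typing import Callable, List, Tuple

Mask = Tuple[int, ...]  # 1 = feature kept, 0 = feature dropped


def ga_select(
    n_features: int,
    score: Callable[[Mask], float],
    pop_size: int = 40,
    generations: int = 60,
    mutation_rate: float = 0.02,
    size_penalty: float = 0.05,  # assumed trade-off between score and subset size
    seed: int = 0,
) -> Tuple[Mask, List[float]]:
    """Return the best mask found and the best fitness per generation."""
    rng = random.Random(seed)

    def fitness(mask: Mask) -> float:
        if not any(mask):  # an empty feature subset is never acceptable
            return float("-inf")
        return score(mask) - size_penalty * sum(mask) / n_features

    def tournament(scored: List[Tuple[float, Mask]]) -> Mask:
        # Pick three random individuals and keep the fittest one.
        return max(rng.sample(scored, 3))[1]

    population = [
        tuple(rng.randint(0, 1) for _ in range(n_features))
        for _ in range(pop_size)
    ]
    history: List[float] = []
    for _ in range(generations):
        scored = sorted(((fitness(m), m) for m in population), reverse=True)
        history.append(scored[0][0])
        children = [scored[0][1]]  # elitism: carry the best mask over unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(scored), tournament(scored)
            # Uniform crossover followed by independent bit-flip mutation.
            child = tuple(a if rng.random() < 0.5 else b for a, b in zip(p1, p2))
            child = tuple(
                bit ^ 1 if rng.random() < mutation_rate else bit for bit in child
            )
            children.append(child)
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


# Hypothetical stand-in score: only features 0..4 carry signal.
INFORMATIVE = (0, 1, 2, 3, 4)


def toy_score(mask: Mask) -> float:
    return sum(mask[i] for i in INFORMATIVE) / len(INFORMATIVE)


best_mask, best_history = ga_select(20, toy_score)
print("selected features:", [i for i, bit in enumerate(best_mask) if bit])
```

In a real pipeline, `score` might wrap scikit-learn's `cross_val_score` with an `ExtraTreesClassifier` and `scoring="f1_macro"`, evaluated only on the columns that the mask selects. Repeating the search with different seeds and averaging the selected subset sizes would mirror the "average over five runs" reporting used in the abstract.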