🤖 AI Summary
Distinguishing cyber-attacks from physical faults in inverter-based grids remains challenging because the two produce similar transient signatures and real-time discrimination capabilities are lacking. Method: This paper proposes a streaming machine learning (ML) classification framework driven by high-fidelity co-simulation that couples electromagnetic transient simulation with digital substation emulation. It applies a cycle-length smoothing filter and a confidence threshold to stabilize online decisions, and it evaluates models in a realistic streaming environment, exposing substantial discrepancies between offline accuracy and operational performance. Contribution/Results: In a systematic evaluation of 12 models on labeled, simulated time-domain measurements sampled at 4.8 kHz, the multilayer perceptron achieves 98%–99% coverage with high precision under streaming, whereas ensemble models preserve perfect anomaly precision but abstain frequently, attaining only 10%–49% coverage. The results show that streaming performance depends strongly on model architecture, that offline accuracy alone is an unreliable indicator of field readiness, and that realistic inference pipelines are needed for dependable classification in inverter-based resource (IBR)-rich systems.
📝 Abstract
This paper presents a high-fidelity evaluation framework for machine learning (ML)-based classification of cyber-attacks and physical faults using electromagnetic transient simulations with digital substation emulation at 4.8 kHz. Twelve ML models, including ensemble algorithms and a multi-layer perceptron (MLP), were trained on labeled time-domain measurements and evaluated in a real-time streaming environment designed for sub-cycle responsiveness. The architecture incorporates a cycle-length smoothing filter and a confidence threshold to stabilize decisions. Results show that while several models achieved near-perfect offline accuracy (up to 99.9%), only the MLP sustained robust coverage (98–99%) under streaming, whereas the ensembles preserved perfect anomaly precision but abstained frequently (10–49% coverage). These findings demonstrate that offline accuracy alone is an unreliable indicator of field readiness and underscore the need for realistic testing and inference pipelines to ensure dependable classification in networks rich in inverter-based resources (IBRs).
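To make the smoothing-plus-threshold idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a 60 Hz nominal frequency (the abstract gives only the 4.8 kHz sampling rate), a moving average of per-sample class probabilities over one cycle as the smoothing filter, and coverage defined as the fraction of samples on which the pipeline does not abstain. The names `StreamingDecider` and `coverage`, and the threshold value, are hypothetical.

```python
from collections import deque

SAMPLE_RATE_HZ = 4800   # sampling rate stated in the abstract
NOMINAL_FREQ_HZ = 60    # assumption: 60 Hz system (not stated in the source)
WINDOW = SAMPLE_RATE_HZ // NOMINAL_FREQ_HZ  # samples per power cycle


class StreamingDecider:
    """Hypothetical helper: averages class probabilities over one cycle
    and abstains (returns None) when the smoothed confidence is too low."""

    def __init__(self, window=WINDOW, threshold=0.9):
        self.window = window
        self.threshold = threshold          # assumed value, not from the paper
        self.buf = deque(maxlen=window)     # keeps only the latest `window` samples

    def update(self, probs):
        """Consume one probability vector; return a class index or None."""
        self.buf.append(list(probs))
        if len(self.buf) < self.window:
            return None                     # abstain until a full cycle is buffered
        n = len(self.buf)
        # Column-wise mean: smoothed probability of each class over the window.
        mean = [sum(col) / n for col in zip(*self.buf)]
        best = max(range(len(mean)), key=mean.__getitem__)
        return best if mean[best] >= self.threshold else None


def coverage(decisions):
    """Fraction of samples with a non-abstained decision (assumed definition)."""
    return sum(d is not None for d in decisions) / len(decisions)


if __name__ == "__main__":
    decider = StreamingDecider(window=4, threshold=0.75)
    stream = [[0.9, 0.1]] * 4 + [[0.5, 0.5]] * 4
    decisions = [decider.update(p) for p in stream]
    print(decisions, coverage(decisions))
```

Under these assumptions, a model that outputs sharply peaked probabilities clears the threshold on most samples and keeps coverage high, while a model whose smoothed confidence hovers near the threshold abstains often. This is one plausible way a model could show high offline accuracy yet low streaming coverage. Note that this sketch counts the warm-up samples before the first full cycle as abstentions.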