🤖 AI Summary
Sub-synchronous control interactions (SSCI) in grid-connected wind/PV systems, triggered by controller gain mismatches, threaten small-signal stability and transient robustness.
Method: This paper proposes a deep reinforcement learning (DRL)-based adaptive gain tuning method. An electromagnetic transient (EMT) closed-loop simulation framework is established, integrating the proximal policy optimization (PPO) algorithm with a customized signal processing module—comprising downsampling, band-pass filtering, and oscillation energy quantification for reward shaping—to enable real-time SSCI detection and dynamic gain optimization.
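The signal-processing chain described above (downsampling, band-pass filtering, oscillation-energy-based reward) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the sample rates, the 5–45 Hz sub-synchronous band, the filter order, and the mean-square energy measure are all assumptions chosen for the example.

```python
import numpy as np
from scipy import signal

def oscillation_reward(waveform, fs_raw=10_000, fs_ds=1_000, band=(5.0, 45.0)):
    """Reward for the DRL agent: negative sub-synchronous oscillation energy.

    waveform -- raw EMT-rate measurement (e.g. converter current), 1-D array
    fs_raw   -- EMT simulation sample rate (Hz), assumed value
    fs_ds    -- rate after downsampling (Hz), assumed value
    band     -- sub-synchronous band of interest (Hz), assumed value
    """
    # 1) Downsample the EMT-rate signal (decimate applies an anti-alias filter)
    x = signal.decimate(waveform, fs_raw // fs_ds, ftype="fir")
    # 2) Band-pass filter to isolate sub-synchronous components
    sos = signal.butter(4, band, btype="bandpass", fs=fs_ds, output="sos")
    x_bp = signal.sosfiltfilt(sos, x)
    # 3) Quantify oscillation energy; its negative serves as the shaped reward,
    #    so persistent SSCI is penalized and well-damped responses are favored
    return -float(np.mean(x_bp ** 2))
```

A signal dominated by the 50/60 Hz fundamental yields a reward near zero, while an added sub-synchronous component (say 20 Hz) makes the reward distinctly negative, which is what drives the gain-tuning policy.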
Contribution/Results: Validated via PSCAD–Python co-simulation under realistic fault scenarios, the method adjusts converter controller gains online according to grid conditions, effectively suppressing sub-synchronous oscillations in the 0.1–50 Hz range. It enhances both small-signal stability and transient response robustness. To the best of our knowledge, this is the first work to introduce end-to-end DRL-based closed-loop control for active SSCI mitigation, establishing a novel paradigm for adaptive wideband oscillation defense.
📝 Abstract
This paper explores the development of learning-based tunable control gains using an EMT-in-the-loop simulation framework (e.g., PSCAD interfaced with Python-based learning modules) to address critical sub-synchronous oscillations. Since sub-synchronous control interactions (SSCI) arise from the mis-tuning of control gains under specific grid configurations, effective mitigation strategies require adaptive re-tuning of these gains. Such adaptiveness can be achieved by employing a closed-loop, learning-based framework that accounts for the grid conditions responsible for such sub-synchronous oscillations. This paper addresses this need by adopting methodologies inspired by Markov decision process (MDP)-based reinforcement learning (RL), with a particular emphasis on simpler deep policy gradient methods augmented with SSCI-specific signal processing modules such as down-sampling, band-pass filtering, and oscillation-energy-dependent reward computation. Experiments in a real-world event setting demonstrate that the trained deep policy-gradient policy can adaptively compute gain settings in response to varying grid conditions and optimally suppress control interaction-induced oscillations.
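To make the deep policy gradient idea concrete, the sketch below runs a bare-bones REINFORCE update on a Gaussian policy over a single controller gain. Everything here is a stand-in: the quadratic `episode_return` (with a hypothetical optimal gain of 0.7) replaces the EMT-in-the-loop oscillation-energy reward, and a single scalar gain replaces the full converter gain vector; the paper's method uses PPO with a neural-network policy rather than this minimal update.

```python
import numpy as np

rng = np.random.default_rng(0)

def episode_return(k):
    # Toy surrogate for the EMT-derived reward: best damping at k* = 0.7
    # (hypothetical optimum); the real reward is -oscillation energy.
    return -(k - 0.7) ** 2

mu, sigma, lr = 0.0, 0.2, 0.02   # Gaussian policy N(mu, sigma) over the gain
for _ in range(1000):
    k = rng.normal(mu, sigma, size=16)   # sample a batch of candidate gains
    r = episode_return(k)
    adv = r - r.mean()                   # batch-mean baseline reduces variance
    # REINFORCE: d log N(k; mu, sigma) / d mu = (k - mu) / sigma^2
    mu += lr * np.mean(adv * (k - mu)) / sigma**2
# mu has drifted toward the (toy) optimal gain 0.7
```

The same loop structure carries over to the paper's setting: the EMT co-simulation supplies the episode return, and the policy output becomes the adaptive gain setting applied to the converter controller.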