🤖 AI Summary
Network uncertainties, particularly communication delays, severely degrade the robustness of wide-area damping control (WADC). Method: This paper proposes a communication-aware, risk-constrained reinforcement learning framework that adds a mean-variance risk constraint to the classical LQR optimal control cost. Unlike conventional delay estimation and compensation, it does not require a precise delay estimate and can also accommodate other cyber uncertainties such as link failures and cyber perturbations. The WADC model includes synchronous generators (SGs) and voltage-source converters (VSCs), and the risk-constrained problem is solved by stochastic gradient descent with max-oracle (SGDmax) using zero-order policy gradient (ZOPG). Results: On the IEEE 68-bus system, SGDmax converges and the VSCs provide additional damping. Compared with conventional delay-compensator-based methods, the approach suppresses oscillations better when the delay estimate is in error.
📝 Abstract
Non-ideal communication links, especially delays, critically affect fast networked controls in power systems, such as wide-area damping control (WADC). Traditionally, a delay estimation and compensation approach is adopted to address this cyber-physical coupling, but it demands very high accuracy for the fast WADC and cannot handle other cyber concerns such as link failures or cyber perturbations. Hence, we propose a new risk-constrained framework that targets communication delays yet remains amenable to general uncertainty under cyber-physical coupling. Our WADC model includes synchronous generators (SGs) as well as voltage source converters (VSCs) for additional damping capabilities. To mitigate uncertainty, a mean-variance risk constraint is added to the classical optimal control cost of the linear quadratic regulator (LQR). Rather than estimating delays, our approach mitigates large communication delays by improving the worst-case performance. A reinforcement learning (RL)-based algorithm, namely stochastic gradient descent with max-oracle (SGDmax), is developed to solve the risk-constrained problem. We further show that it converges to stationarity with high probability, even when using the simple zero-order policy gradient (ZOPG). Numerical tests on the IEEE 68-bus system not only verify SGDmax's convergence and the VSCs' damping capabilities, but also demonstrate that our approach outperforms conventional delay compensator-based methods under estimation error. While focusing on performance improvement under large delays, our proposed risk-constrained design effectively mitigates worst-case oscillations, making it equally effective for addressing other communication issues and cyber perturbations.
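To make the interplay of the risk constraint, the max-oracle, and the zero-order gradient more concrete, the following is a minimal Python sketch and not the authors' implementation. Everything in it is an assumption made for illustration: the 2-state system matrices, the random feedback delay of 0 to 3 steps, the use of the sample variance of episode costs as a stand-in for the mean-variance risk term, the bound `delta`, the multiplier cap `lam_max`, and all function names. The full-size 68-bus model and the paper's exact risk definition are not reproduced here.

```python
import numpy as np

# Hypothetical lightly damped 2-state, 1-input system (not from the paper).
A = np.array([[1.0, 0.1], [-0.2, 0.95]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = 0.1 * np.eye(1)


def rollout_cost(K, delay, rng, horizon=50):
    """Quadratic cost of one episode with delayed feedback u_t = -K x_{t-delay}."""
    x = np.array([1.0, 0.0])
    history = [x.copy()]  # history[t] holds x_t
    cost = 0.0
    for t in range(horizon):
        x_delayed = history[max(t - delay, 0)]
        u = -K @ x_delayed
        cost += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u + 0.01 * rng.standard_normal(2)
        history.append(x.copy())
    return cost


def mean_and_variance(K, seed, n_episodes=20, max_delay=3):
    """Sample mean and variance of the episode cost under random delays."""
    rng = np.random.default_rng(seed)
    costs = np.array([
        rollout_cost(K, int(rng.integers(0, max_delay + 1)), rng)
        for _ in range(n_episodes)
    ])
    return costs.mean(), costs.var()


def max_oracle(variance, delta, lam_max):
    """The Lagrangian is linear in lam, so its maximum over [0, lam_max] is an endpoint."""
    return lam_max if variance > delta else 0.0


def lagrangian(K, lam, delta, seed):
    mean, var = mean_and_variance(K, seed)
    return mean + lam * (var - delta)


def zopg(K, lam, delta, radius, seed, rng):
    """Two-point zero-order gradient estimate along a random unit direction."""
    U = rng.standard_normal(K.shape)
    U /= np.linalg.norm(U)
    # The same seed is used for both evaluations (common random numbers).
    diff = (lagrangian(K + radius * U, lam, delta, seed)
            - lagrangian(K - radius * U, lam, delta, seed))
    return K.size * diff / (2.0 * radius) * U


def sgdmax(K0, delta=0.5, lam_max=5.0, step=1e-4, radius=0.05, iters=100, seed=0):
    """Alternate a max-oracle step on lam with a zero-order descent step on K."""
    rng = np.random.default_rng(seed)
    K = K0.copy()
    for i in range(iters):
        _, var = mean_and_variance(K, seed=i)
        lam = max_oracle(var, delta, lam_max)
        K = K - step * zopg(K, lam, delta, radius, seed=i, rng=rng)
    return K
```

Calling `sgdmax(np.zeros((1, 2)))` returns a 1-by-2 gain matrix. Because the sketch only queries simulated costs, it needs neither a delay estimate nor an analytic gradient, which is the property the abstract attributes to the ZOPG-based design. The step size and smoothing radius are untuned placeholders, so the sketch should not be read as evidence of convergence.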