Physics Informed Reinforcement Learning with Gibbs Priors for Topology Control in Power Grids

📅 2026-04-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the sequential decision-making challenge in power grid topology control, which is characterized by a combinatorially growing action space and costly simulation-based action evaluation. The authors propose a reinforcement learning method that combines a semi-Markov decision process with a physics-informed Gibbs prior. Decisions are triggered only when the grid enters a risky state, and a graph neural network surrogate predicts post-action overload risk to construct a small state-dependent candidate action set and reweight policy logits, substantially reducing exploration difficulty and online simulation overhead. Notably, the authors claim this is the first approach to embed a physics-based Gibbs prior directly into the policy selection mechanism. Experiments across three benchmark environments demonstrate up to 255% higher cumulative reward and 284% more survived steps than a PPO baseline, roughly 200× lower decision time than an oracle method, and about 2.5× faster decisions than a strong specialized engineering baseline.
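The summary describes two mechanisms: using surrogate-predicted risk to pick a small candidate action set, and reweighting policy logits with a Gibbs prior. A minimal sketch of how such a reweighting could work is below; the function name, the inverse temperature `beta`, and the candidate-set size `k` are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def gibbs_reweighted_policy(policy_logits, predicted_risk, beta=5.0, k=8):
    """Reweight policy logits with a physics-informed Gibbs prior.

    policy_logits  : raw logits from the learned policy, shape (n_actions,)
    predicted_risk : surrogate-predicted post-action overload risk, shape (n_actions,)
    beta           : inverse temperature; larger values concentrate the prior
                     on low-risk actions (hypothetical value)
    k              : size of the state-dependent candidate set (hypothetical)
    """
    policy_logits = np.asarray(policy_logits, dtype=float)
    predicted_risk = np.asarray(predicted_risk, dtype=float)

    # Gibbs prior: log p(a) ∝ -beta * risk(a), so low risk -> high prior mass
    log_prior = -beta * predicted_risk

    # State-dependent candidate set: keep the k lowest-risk actions
    candidates = np.argsort(predicted_risk)[:k]

    # Combine policy logits with the log-prior, restricted to the candidates,
    # and normalize with a numerically stable softmax
    combined = policy_logits[candidates] + log_prior[candidates]
    probs = np.exp(combined - combined.max())
    probs /= probs.sum()
    return candidates, probs
```

With zero logits, sampling from `probs` over `candidates` falls back to the Gibbs prior alone; during training, the learned policy can shift probability mass within the low-risk candidate set.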
📝 Abstract
Topology control for power grid operation is a challenging sequential decision-making problem because the action space grows combinatorially with the size of the grid and action evaluation through simulation is computationally expensive. We propose a physics-informed reinforcement learning framework that combines semi-Markov control with a Gibbs prior over the action space that encodes the system's physics. Decisions are taken only when the grid enters a hazardous regime, and a graph neural network surrogate predicts the post-action overload risk of feasible topology actions. These predictions are used to construct a physics-informed Gibbs prior that both selects a small state-dependent candidate set and reweights policy logits before action selection. In this way, our method reduces exploration difficulty and online simulation cost while preserving the flexibility of a learned policy. We evaluate the approach in three realistic benchmark environments of increasing difficulty. Across all settings, the proposed method achieves a strong balance between control quality and computational efficiency: it matches oracle-level performance while being approximately $6\times$ faster on the first benchmark, reaches $94.6\%$ of the oracle reward with roughly $200\times$ lower decision time on the second, and on the most challenging benchmark improves over a PPO baseline by up to $255\%$ in reward and $284\%$ in survived steps while remaining about $2.5\times$ faster than a strong specialized engineering baseline. These results show that our method provides an effective mechanism for topology control in power grids.
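The abstract's semi-Markov trigger (act only in a hazardous regime) can be sketched as a simple threshold on per-line loading ratios. The function name and threshold value below are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def should_act(line_loadings, rho_threshold=0.95):
    """Semi-Markov trigger: intervene only when the grid is hazardous.

    line_loadings : per-line loading ratios (power flow / thermal limit);
                    values near or above 1.0 indicate overload
    rho_threshold : hazard threshold (hypothetical value, not from the paper)
    """
    # No topology action is taken while all lines stay below the threshold;
    # the policy is queried only when at least one line approaches its limit
    return bool(np.max(line_loadings) >= rho_threshold)
```

Gating policy queries this way is what makes the process semi-Markov: between hazardous states, the grid evolves under a default (do-nothing) action for a variable number of steps.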
Problem

Research questions and friction points this paper is trying to address.

topology control
power grids
sequential decision making
combinatorial action space
computational expense
Innovation

Methods, ideas, or system contributions that make the work stand out.

Physics-Informed Reinforcement Learning
Gibbs Prior
Topology Control
Graph Neural Network
Semi-Markov Decision Process
Pantelis Dogoulis
SerVal, SnT, University of Luxembourg, Luxembourg City, Luxembourg
Maxime Cordy
University of Luxembourg
Artificial Intelligence · Machine Learning Security · Testing and Verification · Software Engineering