Physics Informed Reinforcement Learning with Gibbs Priors for Topology Control in Power Grids

📅 2026-04-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses sequential decision-making in power grid topology control, where the action space grows combinatorially with grid size and simulation-based action evaluation is costly. The authors propose a reinforcement learning method that combines semi-Markov control with a physics-informed Gibbs prior over the action space. A decision is taken only when the grid enters a hazardous regime. A graph neural network surrogate then predicts the post-action overload risk of feasible topology actions, and these predictions define a Gibbs prior that selects a small state-dependent candidate set and reweights the policy logits, reducing exploration difficulty and online simulation cost. Across three benchmark environments of increasing difficulty, the method matches oracle-level performance while running approximately 6× faster on the first, reaches 94.6% of oracle reward with roughly 200× lower decision time on the second, and on the most challenging one improves over a PPO baseline by up to 255% in reward and 284% in survived steps while remaining about 2.5× faster than a specialized engineering baseline.
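The event-triggered (semi-Markov) part of the summary can be illustrated with a minimal sketch. The summary does not say how a risky state is detected, so the criterion here (maximum line loading ratio at or above a threshold) and the names `RHO_THRESHOLD`, `is_hazardous`, and `run_episode` are assumptions made for illustration only, not the authors' implementation.

```python
from typing import Callable, List, Sequence, Tuple

# Assumed hazard threshold on the line loading ratio (flow / thermal limit).
RHO_THRESHOLD = 0.95


def is_hazardous(line_loadings: Sequence[float],
                 threshold: float = RHO_THRESHOLD) -> bool:
    """Assumed trigger: the grid is 'risky' if any line is near its limit."""
    return max(line_loadings) >= threshold


def run_episode(loadings_per_step: Sequence[Sequence[float]],
                choose_action: Callable[[Sequence[float]], str],
                threshold: float = RHO_THRESHOLD) -> List[Tuple[int, str]]:
    """Query the policy only at hazardous steps; otherwise do nothing.

    Returns the list of (step index, chosen action) decision points.
    """
    decisions = []
    for t, loadings in enumerate(loadings_per_step):
        if is_hazardous(loadings, threshold):
            decisions.append((t, choose_action(loadings)))
    return decisions
```

Under this sketch, the policy is consulted only at the steps where the trigger fires, so the number of decisions (and any simulation attached to them) scales with the number of hazardous steps rather than with episode length.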

📝 Abstract

Topology control for power grid operation is a challenging sequential decision-making problem because the action space grows combinatorially with the size of the grid and action evaluation through simulation is computationally expensive. We propose a physics-informed reinforcement learning framework that combines semi-Markov control with a Gibbs prior over the action space that encodes the system's physics. A decision is taken only when the grid enters a hazardous regime, while a graph neural network surrogate predicts the post-action overload risk of feasible topology actions. These predictions are used to construct a physics-informed Gibbs prior that both selects a small state-dependent candidate set and reweights policy logits before action selection. In this way, our method reduces exploration difficulty and online simulation cost while preserving the flexibility of a learned policy. We evaluate the approach in three realistic benchmark environments of increasing difficulty. Across all settings, the proposed method achieves a strong balance between control quality and computational efficiency: it matches oracle-level performance while being approximately $6\times$ faster on the first benchmark, reaches $94.6\%$ of oracle reward with roughly $200\times$ lower decision time on the second, and on the most challenging benchmark improves over a PPO baseline by up to $255\%$ in reward and $284\%$ in survived steps while remaining about $2.5\times$ faster than a strong specialized engineering baseline. These results show that our method provides an effective mechanism for topology control in power grids.
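As a rough illustration of how a Gibbs prior can both prune candidates and reweight logits, the sketch below assumes the prior takes the standard Gibbs form $p(a \mid s) \propto \exp(-\beta\, r(s, a))$, where $r$ is the surrogate's predicted overload risk. The abstract does not give the exact energy function, temperature, or candidate-set size, so `beta`, `k`, and the function names are hypothetical, and the risk values would come from the graph neural network surrogate rather than being supplied by hand.

```python
import numpy as np


def log_gibbs_prior(risk, beta=5.0):
    """Log of an assumed Gibbs prior p(a) ∝ exp(-beta * risk[a])."""
    z = -beta * np.asarray(risk, dtype=float)
    z = z - z.max()                      # shift for numerical stability
    return z - np.log(np.exp(z).sum())   # normalized log-probabilities


def select_and_reweight(policy_logits, risk, k=3, beta=5.0):
    """Keep the k actions most probable under the prior, then add the
    log-prior to the policy logits and renormalize over that subset."""
    policy_logits = np.asarray(policy_logits, dtype=float)
    log_prior = log_gibbs_prior(risk, beta)
    candidates = np.argsort(-log_prior)[:k]          # lowest predicted risk first
    combined = policy_logits[candidates] + log_prior[candidates]
    combined = combined - combined.max()
    probs = np.exp(combined)
    probs = probs / probs.sum()
    return candidates, probs


# Hypothetical surrogate risks for five feasible topology actions.
risk = [0.9, 0.1, 0.5, 0.2, 0.8]
logits = np.zeros(5)                     # an untrained, uniform policy
candidates, probs = select_and_reweight(logits, risk, k=3)
action = candidates[np.argmax(probs)]    # greedy choice within the candidate set
```

With a uniform policy, the choice is driven entirely by the prior, so the lowest-risk action is preferred; as the policy's logits become informative, they can shift probability mass among the retained candidates while high-risk actions stay excluded.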

Problem

Research questions and friction points this paper is trying to address.

topology control

power grids

sequential decision making

combinatorial action space

computational expense

Innovation

Methods, ideas, or system contributions that make the work stand out.

Physics-Informed Reinforcement Learning

Gibbs Prior

Topology Control