🤖 AI Summary
This work addresses the drastic grid topology changes forced by Public Safety Power Shutoff (PSPS) events, which can render conventional operating points infeasible. The authors propose a multi-stage fine-tuning framework that enables an instruction-tuned large language model to generate open-only corrective switching actions from compact scenario summaries under an explicit switching budget. The pipeline has three stages: supervised fine-tuning that distills a DC-OPF MILP oracle into a constrained action grammar, which makes outputs reliably parseable and checkable for feasibility; direct preference optimization (DPO) on preference pairs ranked by an AC power flow–based voltage penalty; and best-of-N selection at inference time. Evaluated on IEEE 118-bus PSPS scenarios, the method reduces the AC power-flow failure rate from 50% under zero-shot generation to single digits, substantially improves DC objective values, and improves voltage-penalty outcomes on the common set of successfully solved instances.
📝 Abstract
Public Safety Power Shutoffs (PSPS) force rapid topology changes that can render standard operating points infeasible, requiring operators to quickly identify corrective transmission switching actions that reduce load shedding while maintaining acceptable voltage behavior. We present a verifiable, multi-stage adaptation pipeline that fine-tunes an instruction-tuned large language model (LLM) to generate open-only corrective switching plans from compact PSPS scenario summaries under an explicit switching budget. First, supervised fine-tuning distills a DC-OPF MILP oracle into a constrained action grammar that enables reliable parsing and feasibility checks. Second, direct preference optimization refines the policy using AC-evaluated preference pairs ranked by a voltage-penalty metric, injecting voltage-awareness beyond DC imitation. Finally, best-of-N selection provides an inference-time addition by choosing the best feasible candidate under the target metric. On IEEE 118-bus PSPS scenarios, fine-tuning substantially improves DC objective values versus zero-shot generation, reduces AC power-flow failure from 50% to single digits, and improves voltage-penalty outcomes on the common-success set. Code and data-generation scripts are released to support reproducibility.
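The interplay between the constrained action grammar, the switching budget, and best-of-N selection can be illustrated with a minimal Python sketch. It is not the authors' implementation. The plan syntax (`OPEN <line id>` entries separated by `; `, or `NOOP`), the function names, and the `score` callback are all hypothetical; `score` stands in for an AC power-flow evaluation that returns a voltage penalty (lower is better) or `None` when the power flow fails to converge.

```python
import re
from typing import Callable, Iterable, List, Optional, Set

# Hypothetical grammar: "NOOP" or "OPEN 12; OPEN 45" (open-only actions).
PLAN_RE = re.compile(r"^\s*(NOOP|OPEN \d+(?:; OPEN \d+)*)\s*$")


def parse_plan(text: str) -> Optional[List[int]]:
    """Return the list of line ids to open, or None if the text is off-grammar."""
    match = PLAN_RE.match(text)
    if match is None:
        return None
    body = match.group(1)
    if body == "NOOP":
        return []
    return [int(token.split()[1]) for token in body.split("; ")]


def is_feasible(plan: List[int], switchable: Set[int], budget: int) -> bool:
    """Check the switching budget, duplicate actions, and line availability."""
    return (
        len(plan) <= budget
        and len(set(plan)) == len(plan)
        and all(line in switchable for line in plan)
    )


def best_of_n(
    candidates: Iterable[str],
    switchable: Set[int],
    budget: int,
    score: Callable[[List[int]], Optional[float]],
) -> Optional[List[int]]:
    """Pick the feasible candidate with the lowest score; None if none survive."""
    best_plan: Optional[List[int]] = None
    best_score = float("inf")
    for text in candidates:
        plan = parse_plan(text)
        if plan is None or not is_feasible(plan, switchable, budget):
            continue  # unparseable or violates the budget / availability
        value = score(plan)
        if value is None:
            continue  # stand-in for an AC power-flow failure
        if best_plan is None or value < best_score:
            best_plan, best_score = plan, value
    return best_plan
```

Because every candidate passes through the same parser and feasibility check before it is scored, malformed or over-budget generations are discarded rather than repaired, which is one way a constrained grammar can make the output verifiable.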