ProOPF: Benchmarking and Improving LLMs for Professional-Grade Power Systems Optimization Modeling

📅 2026-02-03
🤖 AI Summary
This work addresses two problems: the operational uncertainty introduced by high renewable energy penetration, and the lack of rigorous benchmarks for evaluating large language models (LLMs) on professional-grade optimal power flow (OPF) modeling. The authors propose the NL-to-OPF framework, which combines natural language processing, LLMs, and domain expertise to generate and verify executable OPF code from natural-language descriptions of dispatch requirements. The core contribution is an OPF modeling dataset and benchmark tailored to power system scenarios: ProOPF-D, with 12K instances, and ProOPF-B, with 121 expert-annotated test cases. The benchmark supports end-to-end evaluation of both concrete and abstract OPF modeling tasks, giving a rigorous and reproducible basis for assessing LLM capabilities in power system optimization.

📝 Abstract
Growing renewable penetration introduces substantial uncertainty into power system operations, necessitating frequent adaptation of dispatch objectives and constraints and challenging expertise-intensive, near-real-time modeling workflows. Large Language Models (LLMs) provide a promising avenue for automating this process by translating natural-language (NL) operational requirements into executable optimization models via semantic reasoning and code synthesis. Yet existing LLM datasets and benchmarks for optimization modeling primarily target coarse-grained cross-domain generalization and offer little rigorous evaluation in power-system settings, particularly for Optimal Power Flow (OPF). We therefore introduce ProOPF-D and ProOPF-B, a dataset and benchmark for professional-grade OPF modeling: ProOPF-D contains 12K instances pairing NL requests with parameter adjustments and structural extensions to a canonical OPF, together with executable implementations; ProOPF-B provides 121 expert-annotated test cases with ground-truth code, enabling end-to-end evaluation under both concrete and abstract OPF modeling regimes.
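
To make the idea of pairing an NL request with a parameter adjustment to a base model more concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than the paper's method or schema: the `Instance` record layout, the field names, the `apply_adjustments` helper, and the two-generator data are all hypothetical, and the "canonical OPF" is reduced to a single-bus economic dispatch with linear costs and no network constraints, solved by merit order.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Generator:
    name: str
    p_min: float  # minimum output, MW
    p_max: float  # maximum output, MW
    cost: float   # linear marginal cost, $/MWh


@dataclass
class Instance:
    """Hypothetical record layout; NOT the actual ProOPF-D schema."""
    nl_request: str
    # Maps (generator name, attribute) to a new value.
    parameter_adjustments: dict = field(default_factory=dict)


def apply_adjustments(gens, adjustments):
    """Return a copy of the generator list with the adjustments applied."""
    updated = {g.name: g for g in gens}
    for (gen_name, attr), value in adjustments.items():
        updated[gen_name] = replace(updated[gen_name], **{attr: value})
    return list(updated.values())


def dispatch(gens, demand):
    """Merit-order dispatch for a single bus with linear costs.

    Every unit starts at p_min; the remaining demand is then filled
    from the cheapest unit upward. This toy model ignores network
    constraints, so it is far simpler than a real OPF.
    """
    if not sum(g.p_min for g in gens) <= demand <= sum(g.p_max for g in gens):
        raise ValueError("demand outside aggregate generator limits")
    output = {g.name: g.p_min for g in gens}
    remaining = demand - sum(output.values())
    for g in sorted(gens, key=lambda g: g.cost):
        step = min(g.p_max - g.p_min, remaining)
        output[g.name] += step
        remaining -= step
    return output


def total_cost(gens, output):
    """Total hourly cost in $ for a given dispatch."""
    return sum(g.cost * output[g.name] for g in gens)


# Illustrative data only.
BASE = [
    Generator("G1", 0.0, 100.0, 20.0),
    Generator("G2", 0.0, 100.0, 35.0),
]
instance = Instance(
    nl_request="Limit G1 to 80 MW for maintenance.",
    parameter_adjustments={("G1", "p_max"): 80.0},
)

base_plan = dispatch(BASE, 120.0)
adjusted_gens = apply_adjustments(BASE, instance.parameter_adjustments)
adjusted_plan = dispatch(adjusted_gens, 120.0)
```

With a 120 MW demand, capping G1 at 80 MW shifts 20 MW to the more expensive G2. The sketch covers only a parameter adjustment; a structural extension, such as adding a new constraint or objective term, would change the model itself rather than its data and is not shown here.
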
Problem

Research questions and friction points this paper is trying to address.

Optimal Power Flow
Large Language Models
Power Systems Optimization
Natural Language to Code
Benchmarking
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large Language Models
Optimal Power Flow
Power Systems Optimization
Natural Language to Code
Benchmark Dataset