🤖 AI Summary
Existing approaches that use large language models (LLMs) to generate mathematical optimization models rarely verify either the structure of the generated model or the validity of its solutions, which limits modeling accuracy. This work proposes Opt-Verifier, an LLM-based framework with dual-side verification that checks each generated model from two perspectives: structure and solution. Structure-side verification checks that the model's variables and constraints match the original problem description, while solution-side verification interprets the model's solutions and evaluates whether they are valid. Experiments on popular benchmarks show an improvement of over 20% in modeling accuracy.
📝 Abstract
Building mathematical optimization models is critical in operations research (OR), but it requires substantial human expertise. Recent work has used large language models (LLMs) to automate this modeling process. However, existing approaches often struggle to verify the correctness of the generated optimization models: they check neither the rationality of the constraints and variables nor the validity of the solutions to the generated models. This gap hampers the subsequent verification and correction steps and thus severely hurts modeling accuracy. To address this challenge, we propose a novel LLM-based framework with Dual-side Verification (Opt-Verifier), which verifies generated models from both the structure and the solution perspectives to improve modeling accuracy. The structure-side verification ensures that the modeling structure of the generated optimization model aligns with the original problem description, accurately capturing the problem's constraints and requirements. Meanwhile, the solution-side verification interprets the solutions and evaluates their validity, confirming that the optimization model is logically and mathematically sound. Experiments on popular benchmarks demonstrate that our approach achieves over 20% improvement in accuracy.
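To make the two verification sides concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes the problem's requirements have already been extracted into explicit sets and predicates, which Opt-Verifier would presumably obtain with an LLM. All names here (`CandidateModel`, `verify_structure`, `verify_solution`, and the chairs-and-tables example) are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

Solution = Dict[str, float]
Check = Callable[[Solution], bool]


@dataclass
class CandidateModel:
    """Hypothetical stand-in for an LLM-generated optimization model."""
    variables: Set[str]
    constraints: Dict[str, Check]  # constraint name -> predicate on a solution
    solution: Solution             # solution returned by some solver


def verify_structure(model: CandidateModel,
                     required_variables: Set[str],
                     required_constraints: Set[str]) -> List[str]:
    """Structure side: list problem elements the model fails to capture."""
    issues = [f"missing variable: {v}"
              for v in sorted(required_variables - model.variables)]
    issues += [f"missing constraint: {c}"
               for c in sorted(required_constraints - set(model.constraints))]
    return issues


def verify_solution(solution: Solution,
                    requirement_checks: Dict[str, Check]) -> List[str]:
    """Solution side: list stated requirements that the solution violates."""
    return [name for name, check in requirement_checks.items()
            if not check(solution)]


# Assumed toy problem: at most 40 labor hours (a chair takes 2 hours,
# a table takes 4 hours) and at least 5 chairs must be produced.
requirement_checks: Dict[str, Check] = {
    "labor_limit": lambda s: 2 * s["chairs"] + 4 * s["tables"] <= 40,
    "min_chairs": lambda s: s["chairs"] >= 5,
}

# The candidate model omits the minimum-chairs constraint, so its
# solution produces no chairs at all.
candidate = CandidateModel(
    variables={"chairs", "tables"},
    constraints={"labor_limit": requirement_checks["labor_limit"]},
    solution={"chairs": 0.0, "tables": 10.0},
)

structure_issues = verify_structure(
    candidate, {"chairs", "tables"}, {"labor_limit", "min_chairs"})
solution_issues = verify_solution(candidate.solution, requirement_checks)

print(structure_issues)
print(solution_issues)
```

In this sketch both sides flag the same underlying defect from different angles: the structure check reports the missing constraint, and the solution check reports that the returned solution violates the corresponding requirement. Either report could then feed a correction step.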