🤖 AI Summary
This work addresses the autoformulation problem, which it is the first to define formally: automatically translating natural-language problem descriptions into solvable mathematical optimization models. It proposes an LLM-driven Monte Carlo Tree Search (MCTS) framework that exploits the hierarchical structure of optimization modeling, with LLMs serving both as dynamic generators of formulation hypotheses and as evaluators of formulation correctness. A key component is an equivalence-aware pruning mechanism that removes trivially equivalent formulations, reducing search overhead by over 40%. Empirically, the approach achieves state-of-the-art performance on LP/MIP benchmarks, outperforming existing baselines, and LLM-assisted verification substantially accelerates correctness assessment. Together, these contributions establish a scalable, automated paradigm that lowers the barrier to optimization modeling for domain experts.
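The summary mentions equivalence-aware pruning of trivially equivalent formulations but does not specify the mechanism. The Python sketch below shows one plausible realization, not the paper's actual method: each candidate linear formulation is hashed by a canonical key that is invariant to variable renaming and constraint reordering, so trivially equivalent candidates collapse to a single search node. The function names and the data layout (objective as a variable-to-coefficient dict, constraints as `(coefficients, sense, rhs)` triples, every variable appearing in the objective) are illustrative assumptions.

```python
def canonicalize(objective, constraints):
    """Build a hashable key invariant to variable renaming (variables are
    relabeled x0, x1, ... by first appearance in the objective) and to
    constraint reordering, so trivially equivalent formulations collide.
    Assumes every variable occurs in the objective (a sketch-level
    simplification, not something the paper states)."""
    names = {}
    for var in objective:
        names.setdefault(var, f"x{len(names)}")
    canon_obj = tuple(sorted((names[v], c) for v, c in objective.items()))
    canon_cons = tuple(sorted(
        (tuple(sorted((names[v], c) for v, c in coeffs.items())), sense, rhs)
        for coeffs, sense, rhs in constraints))
    return canon_obj, canon_cons

def should_prune(formulation, seen):
    """Skip a candidate whose canonical key was already explored."""
    key = canonicalize(*formulation)
    if key in seen:
        return True
    seen.add(key)
    return False
```

For example, a model over variables `a, b` and the same model with variables renamed `u, v` and its constraints listed in a different order produce identical keys, so the second candidate is pruned without being re-evaluated.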
📝 Abstract
Mathematical optimization is fundamental to decision-making across diverse domains, from operations research to healthcare. Yet, translating real-world problems into optimization models remains a formidable challenge, often demanding specialized expertise. This paper formally introduces the concept of $\textbf{autoformulation}$ -- an automated approach to creating optimization models from natural language descriptions for commercial solvers. We identify the three core challenges of autoformulation: (1) defining the vast, problem-dependent hypothesis space, (2) efficiently searching this space under uncertainty, and (3) evaluating formulation correctness (ensuring a formulation accurately represents the problem). To address these challenges, we introduce a novel method leveraging $\textit{Large Language Models}$ (LLMs) within a $\textit{Monte-Carlo Tree Search}$ framework. This approach systematically explores the space of possible formulations by exploiting the hierarchical nature of optimization modeling. LLMs serve two key roles: as dynamic formulation hypothesis generators and as evaluators of formulation correctness. To enhance search efficiency, we introduce a pruning technique to remove trivially equivalent formulations. Empirical evaluations across benchmarks containing linear and mixed-integer programming problems demonstrate our method's superior performance. Additionally, we observe significant efficiency gains from employing LLMs for correctness evaluation and from our pruning techniques.
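The abstract's search procedure can be sketched as a standard MCTS loop over the hierarchical stages of a formulation. The sketch below is illustrative only: the paper's actual algorithm, prompts, and reward design are not reproduced here, and `propose_hypotheses` and `score_formulation` are hypothetical stand-ins for the LLM-based hypothesis generator and correctness evaluator (replaced by toy deterministic logic so the code runs).

```python
import math
import random

STAGES = ["variables", "objective", "constraints"]  # hierarchical modeling levels

def propose_hypotheses(stage, partial):
    """Stand-in for an LLM proposing candidate components at a stage."""
    return [f"{stage}_option_{i}" for i in range(2)]

def score_formulation(components):
    """Stand-in for LLM-assisted correctness evaluation (toy score)."""
    return sum(1.0 for c in components if c.endswith("_0")) / len(STAGES)

class Node:
    def __init__(self, partial, parent=None):
        self.partial = partial       # formulation components chosen so far
        self.parent = parent
        self.children = []
        self.visits = 0
        self.total_reward = 0.0

    def ucb(self, c=1.4):
        if self.visits == 0:
            return float("inf")      # always try unvisited children first
        exploit = self.total_reward / self.visits
        explore = c * math.sqrt(math.log(self.parent.visits) / self.visits)
        return exploit + explore

def mcts(iterations=50, seed=0):
    random.seed(seed)
    root = Node([])
    for _ in range(iterations):
        # Selection: walk down by UCB until reaching a frontier node.
        node = root
        while node.children:
            node = max(node.children, key=Node.ucb)
        # Expansion: query the (mock) LLM for the next modeling stage.
        if len(node.partial) < len(STAGES):
            stage = STAGES[len(node.partial)]
            node.children = [Node(node.partial + [h], node)
                             for h in propose_hypotheses(stage, node.partial)]
            node = random.choice(node.children)
        # Simulation: complete the formulation with random choices.
        rollout = list(node.partial)
        while len(rollout) < len(STAGES):
            stage = STAGES[len(rollout)]
            rollout.append(random.choice(propose_hypotheses(stage, rollout)))
        reward = score_formulation(rollout)
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
    # Return the most-visited first-stage hypothesis.
    return max(root.children, key=lambda n: n.visits).partial[0]
```

In the paper's setting, the equivalence pruning it describes would additionally merge expanded children that are trivially equivalent, shrinking the tree before simulation.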