🤖 AI Summary
Mathematical optimization modeling relies heavily on domain experts and remains poorly automated. Method: This work systematically investigates how large language models (LLMs) can enable automated mathematical modeling, focusing on data synthesis, instruction fine-tuning, reasoning framework design, benchmark construction, and evaluation methodology. To address pervasive labeling errors (>40%) in mainstream benchmarks (e.g., OptiMath, MOBench), we conduct the first large-scale manual verification and cleaning, yielding the high-quality OptiClean dataset. Contribution/Results: Based on OptiClean, we establish the first fair, reproducible leaderboard for automated modeling; release an open-source repository integrating datasets, code, literature, and an online evaluation platform; and provide a standardized evaluation framework, reliable benchmark, and scalable technical paradigm for LLM-driven modeling automation, significantly advancing the field's standardization and rigor.
📝 Abstract
Owing to its utility in solving real-world problems, optimization modeling has been widely employed for optimal decision-making across various sectors, but it requires substantial expertise from operations research professionals. With the advent of large language models (LLMs), new opportunities have emerged to automate the mathematical modeling process. This survey presents a comprehensive and timely review of recent advancements covering the entire technical stack, including data synthesis and fine-tuning of base models, inference frameworks, benchmark datasets, and performance evaluation. In addition, we conduct an in-depth analysis of the quality of benchmark datasets, which were found to have surprisingly high error rates. We clean the datasets and construct a new leaderboard that enables fair performance comparison across base LLMs and datasets. We also build an online portal that integrates the cleaned datasets, code, and a paper repository to benefit the community. Finally, we identify limitations in current methodologies and outline future research opportunities.