AlphaOPT: Formulating Optimization Programs with Self-Improving LLM Experience Library

📅 2025-10-21
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Automating the translation of informal optimization requirements into formal mathematical models and executable solver code remains a significant challenge. Method: the paper proposes a two-stage self-evolving framework. In Stage I (Library Learning), a large language model (LLM) uses solver feedback and retrieval-augmented generation to build a structured experience library, without annotated reasoning traces and using only ground-truth answers. In Stage II (Library Evolution), insights extracted from failure cases are used to refine each entry's applicability conditions, enabling explicit, interpretable continual learning without fine-tuning. Contribution/Results: the core idea is to model the library's evolution as a condition-driven self-improvement process that supports cross-task generalization. On the out-of-distribution OptiBench benchmark, the method outperforms the strongest baseline by 7.7%, and accuracy improves from 65% to 72% as training samples grow from 100 to 300, demonstrating both effectiveness and scalability.

📝 Abstract
Optimization modeling enables critical decisions across industries but remains difficult to automate: informal language must be mapped to precise mathematical formulations and executable solver code. Prior LLM approaches either rely on brittle prompting or costly retraining with limited generalization. We present AlphaOPT, a self-improving experience library that enables an LLM to learn from limited demonstrations (even answers alone, without gold-standard programs) and solver feedback - without annotated reasoning traces or parameter updates. AlphaOPT operates in a continual two-phase cycle: (i) a Library Learning phase that reflects on failed attempts, extracting solver-verified, structured insights as {taxonomy, condition, explanation, example}; and (ii) a Library Evolution phase that diagnoses retrieval misalignments and refines the applicability conditions of stored insights, improving transfer across tasks. This design (1) learns efficiently from limited demonstrations without curated rationales, (2) expands continually without costly retraining by updating the library rather than model weights, and (3) makes knowledge explicit and interpretable for human inspection and intervention. Experiments show that AlphaOPT steadily improves with more data (65% to 72% from 100 to 300 training items) and surpasses the strongest baseline by 7.7% on the out-of-distribution OptiBench dataset when trained only on answers. Code and data are available at: https://github.com/Minw913/AlphaOPT.
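The abstract describes the library's entries as structured insights of the form {taxonomy, condition, explanation, example}, retrieved per task. A minimal sketch of that schema with a toy keyword-overlap retriever is below; the class names, fields-as-strings, and scoring function are illustrative assumptions, not the paper's actual implementation (which the abstract does not specify at this level of detail).

```python
from dataclasses import dataclass


@dataclass
class Insight:
    """One experience-library entry, following the paper's
    {taxonomy, condition, explanation, example} schema."""
    taxonomy: str     # e.g. a modeling-technique category
    condition: str    # when the insight applies (refined during Library Evolution)
    explanation: str  # why the technique works
    example: str      # a solver-verified snippet or formulation


class ExperienceLibrary:
    """Toy library with keyword-overlap retrieval (illustrative only)."""

    def __init__(self):
        self.insights: list[Insight] = []

    def add(self, insight: Insight) -> None:
        self.insights.append(insight)

    def retrieve(self, task_description: str, k: int = 3) -> list[Insight]:
        # Score each insight by word overlap between the task text
        # and the insight's applicability condition.
        words = set(task_description.lower().split())
        scored = sorted(
            self.insights,
            key=lambda ins: -len(words & set(ins.condition.lower().split())),
        )
        return scored[:k]


lib = ExperienceLibrary()
lib.add(Insight(
    taxonomy="linearization",
    condition="product of a binary and a continuous variable appears",
    explanation="replace the bilinear term with big-M constraints",
    example="z <= M*y; z <= x; z >= x - M*(1 - y)",
))
hits = lib.retrieve("the model has a product of a binary and a continuous variable")
```

A real retriever would use embeddings rather than word overlap, but the point is the same: the library, not the model weights, is the unit that stores and transfers knowledge.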
Problem

Research questions and friction points this paper is trying to address.

Automating optimization modeling: mapping informal language to precise mathematical formulations
Learning from limited demonstrations without annotated reasoning traces
Continually improving performance via an experience library, without costly model retraining
Innovation

Methods, ideas, or system contributions that make the work stand out.

A self-improving experience library that learns from limited demonstrations and solver feedback
A two-phase Library Learning / Library Evolution cycle that requires no model retraining
Explicit, interpretable knowledge that enables human inspection and intervention
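The Library Evolution phase diagnoses retrieval misalignments and refines stored applicability conditions. A hypothetical sketch of such a refinement step is shown here; the function name, dict layout, and the idea of appending diagnostic text to the condition are assumptions for illustration (in the paper the refinement is presumably LLM-generated from solver feedback, not rule-based string edits).

```python
def refine_condition(insight: dict, failure_note: str, overly_broad: bool) -> dict:
    """Library-evolution step (sketch): after a retrieval misfire, tighten
    or broaden the stored applicability condition.

    failure_note: hypothetical diagnostic text, e.g. produced by an LLM
    reflecting on a failed solve.
    overly_broad: True if the insight was wrongly retrieved for this task.
    """
    if overly_broad:
        # Insight fired where it should not have: narrow the condition.
        insight["condition"] += f"; does NOT apply when {failure_note}"
    else:
        # Insight was missed where it would have helped: broaden it.
        insight["condition"] += f"; ALSO applies when {failure_note}"
    return insight


entry = {
    "taxonomy": "linearization",
    "condition": "product of a binary and a continuous variable appears",
    "explanation": "replace the bilinear term with big-M constraints",
    "example": "z <= M*y; z <= x; z >= x - M*(1 - y)",
}
refine_condition(entry, "the continuous variable is unbounded", overly_broad=True)
```

Because conditions stay as plain text, a human can inspect or hand-edit any entry, which is the interpretability property the bullets above highlight.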
Minwei Kong
London School of Economics and Political Science
Ao Qu
Massachusetts Institute of Technology
Language Agent, Multisensory AI, Computational Social Science
Xiaotong Guo
Ph.D. in Transportation, MIT
Transportation Modeling, Optimization, Shared Mobility, Public Transit
Wenbin Ouyang
Massachusetts Institute of Technology
Chonghe Jiang
Massachusetts Institute of Technology
Han Zheng
Massachusetts Institute of Technology
Yining Ma
Postdoctoral Associate, MIT
Machine Learning, Optimization, Learning to Optimize, Neural Combinatorial Optimization
Dingyi Zhuang
Massachusetts Institute of Technology
Yuhan Tang
Singapore-MIT Alliance for Research and Technology
Junyi Li
Singapore-MIT Alliance for Research and Technology
Hai Wang
Singapore Management University
Cathy Wu
Massachusetts Institute of Technology
Jinhua Zhao
Singapore-MIT Alliance for Research and Technology