🤖 AI Summary
This work addresses the problem that large language models often generate incorrect or inefficient optimization modeling code because they do not reliably choose an effective modeling strategy. The authors propose SAGE, a framework that makes modeling strategy explicit in both data construction and post-training. SAGE builds a solver-verified multi-strategy dataset, trains a student model with supervised fine-tuning followed by Segment-Weighted GRPO reinforcement learning, and uses a composite reward over format compliance, correctness, and solver efficiency. Across eight benchmarks, it raises average pass@1 to 80.3, up from 72.7 for the strongest open-source baseline. It also improves component-level diversity at pass@16 by 19–29% and, at the largest scale, produces constraint systems with 14.2% fewer constraints than the baseline.
📝 Abstract
Large language models (LLMs) can generate syntactically valid optimization programs, yet often struggle to reliably choose an effective modeling strategy, leading to incorrect formulations and inefficient solver behavior. We propose SAGE, a strategy-aware framework that makes Modeling Strategy explicit in both data construction and post-training. SAGE builds a solver-verified multi-strategy dataset and trains a student model with supervised fine-tuning followed by Segment-Weighted GRPO using a composite reward over format compliance, correctness, and solver efficiency. Across eight benchmarks spanning synthetic and real-world settings, SAGE raises average pass@1 to 80.3, compared with 72.7 for the strongest open-source baseline. With multiple generations, SAGE discovers more distinct correct formulations and improves component-level diversity at pass@16 by 19–29%. At the largest scale, SAGE produces more compact constraint systems with 14.2% fewer constraints than the baseline, consistent with solver-efficient modeling. Overall, these results show that making Modeling Strategy explicit improves automated optimization modeling. Code is available at https://github.com/rachhhhing/SAGE.
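The headline numbers above are reported as pass@1 and pass@16. As a minimal sketch of how such a metric is commonly computed, the snippet below uses the standard unbiased pass@k estimator from the code-generation literature. This is an assumption: the abstract does not state which estimator SAGE uses, and the function names `pass_at_k` and `mean_pass_at_k` are hypothetical, not taken from the SAGE repository.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of generations sampled for the problem
    c: number of those generations judged correct
    k: sample budget being evaluated (k <= n)

    Returns the probability that at least one of k generations,
    drawn without replacement from the n samples, is correct.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k incorrect samples: every size-k draw contains a correct one.
    if n - c < k:
        return 1.0
    # 1 - P(all k drawn samples are incorrect)
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, given (n, c) pairs per problem."""
    if not results:
        raise ValueError("results must be non-empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With `k = 1` the estimator reduces to the fraction of correct generations per problem, averaged over problems; a larger `k` such as 16 rewards a model that finds at least one correct formulation among many attempts.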