Models Can Model, But Can't Bind: Structured Grounding in Text-to-Optimization

📅 2026-05-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the sharp accuracy drop of large language models on text-to-optimization tasks as instances grow, which it attributes primarily to inaccurate binding of problem parameters rather than to poor formulation. It is the first to explicitly disentangle modeling capability from binding capability, and it introduces BIND, an inference-time method that externalizes numeric parameters into structured files so the model binds them programmatically. For systematic evaluation, the authors construct the Text2Opt-Bench benchmark and propose the concept of the “effective binding limit.” Experiments show that BIND raises the accuracy of GPT-5-Nano from 59.1% to 82.4% and that of GPT-5 from 86.2% to 95.8%. Moreover, a dedicated binding model with only 1.5B parameters matches a 7B end-to-end baseline, which supports the proposed decoupling strategy.

📝 Abstract

Text-to-optimization requires two separable capabilities: modeling -- choosing the right optimization structure -- and binding -- grounding every coefficient, index, and parameter in the concrete problem data. We study this via Text2Opt-Bench, a scalable benchmark of solver-verified optimization problems spanning 12 categories, from textbook linear programs to stochastic and multi-objective formulations with up to thousands of variables. Across 10+ models, we find that accuracy collapses as instance data grows, even when the formulation itself is simple. We call this the effective binding limit. We address this via a simple inference-time approach, BIND, which externalizes numeric data to structured files so the model binds data programmatically rather than transcribing from the prompt. BIND improves GPT-5-Nano from 59.1% to 82.4% accuracy, matching pass@5 (82.0%) at lower token cost than pass@1, and GPT-5 from 86.2% to 95.8%. Furthermore, we validate our hypothesis by finetuning a model exclusively on binding and show that it outperforms end-to-end SFT and RL across three structurally distinct optimization categories, with a 1.5B binding specialist alone matching a 7B end-to-end baseline.
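The idea of binding data programmatically rather than transcribing it from the prompt can be illustrated with a minimal sketch. This is not the paper's implementation: the JSON schema (`supply`, `demand`, `cost`), the transportation-problem setting, and the function name `build_transport_lp` are all assumptions chosen for illustration. The point is only that once the numbers live in a file, the model needs to write a short loop over indices instead of copying every coefficient by hand.

```python
import json
import os
import tempfile


def build_transport_lp(data_path):
    """Bind a transportation LP to numbers read from a JSON file.

    Hypothetical schema (an assumption, not taken from the paper):
      {"supply": [...], "demand": [...], "cost": [[...], ...]}

    Variable x[i][j] (flattened row-major) is the amount shipped from
    source i to sink j. Returns (c, A_ub, b_ub, A_eq, b_eq) as lists.
    """
    with open(data_path) as f:
        data = json.load(f)

    supply, demand, cost = data["supply"], data["demand"], data["cost"]
    m, n = len(supply), len(demand)

    # Fail loudly on mismatched data instead of silently mis-binding it.
    if len(cost) != m or any(len(row) != n for row in cost):
        raise ValueError("cost matrix shape does not match supply/demand")

    # Objective: minimize sum_ij cost[i][j] * x[i][j].
    c = [cost[i][j] for i in range(m) for j in range(n)]

    # Supply rows: sum_j x[i][j] <= supply[i].
    A_ub = [[1 if k // n == i else 0 for k in range(m * n)] for i in range(m)]
    b_ub = list(supply)

    # Demand rows: sum_i x[i][j] == demand[j].
    A_eq = [[1 if k % n == j else 0 for k in range(m * n)] for j in range(n)]
    b_eq = list(demand)

    return c, A_ub, b_ub, A_eq, b_eq


# Illustrative instance, written to a temporary file to stand in for the
# externalized data; the values are made up for this example.
instance = {
    "supply": [20, 30],
    "demand": [10, 25, 15],
    "cost": [[4, 6, 9], [5, 3, 7]],
}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(instance, f)
    data_path = f.name

c, A_ub, b_ub, A_eq, b_eq = build_transport_lp(data_path)
os.remove(data_path)

print(len(c), "variables,", len(A_ub) + len(A_eq), "constraints")
```

The binding code has the same length whether the instance has 6 variables or several thousand, which is the property the externalization relies on. The resulting lists could then be handed to any LP solver; which solver is used is left open here.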

Problem

Research questions and friction points this paper is trying to address.

text-to-optimization

binding

structured grounding

optimization problems

effective binding limit

Innovation

Methods, ideas, or system contributions that make the work stand out.

text-to-optimization

binding

structured grounding