Formalize, Don't Optimize: The Heuristic Trap in LLM-Generated Combinatorial Solvers

📅 2026-05-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work examines how large language models (LLMs) should be used to generate solvers for combinatorial problems, where asking the model to also optimize search often introduces errors or degrades performance. The authors construct CP-SynC-XL, a benchmark of 100 combinatorial problems with 4,577 instances, to systematically evaluate three solver-construction paradigms: native Python, Python with OR-Tools, and MiniZinc with OR-Tools. Their analysis reveals a “heuristic trap”: prompting LLMs to generate optimized search logic yields only small median speed-ups (1.03–1.12×), while many instances slow down and correctness drops sharply on a long tail of problems. Among the paradigms, Python + OR-Tools attains the highest correctness across LLMs. Based on these findings, the study proposes a “reformulate carefully, optimize sparingly” principle: LLMs should focus on formalizing variables, constraints, and objectives for verified solvers, and any LLM-authored search optimization should be checked separately before use.
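To make the "formalize, don't optimize" idea concrete, here is a minimal sketch in plain Python. It is not taken from the paper or from CP-SynC-XL: the toy problem, the `solve` helper, and all names are hypothetical. The point is only the separation of concerns: the problem-specific part states variables, constraints, and an objective, while a generic, complete search routine (standing in for a verified solver back-end) does the solving.

```python
from itertools import product


def solve(domains, constraints, objective):
    """Generic complete search (hypothetical stand-in for a solver back-end).

    Enumerates every assignment, keeps the feasible one with the
    largest objective value. No problem-specific heuristics.
    """
    names = list(domains)
    best, best_value = None, None
    for values in product(*(domains[n] for n in names)):
        assignment = dict(zip(names, values))
        if all(check(assignment) for check in constraints):
            value = objective(assignment)
            if best_value is None or value > best_value:
                best, best_value = assignment, value
    return best, best_value


# Formalization only: this is the part one would ask the LLM to write.
# Toy problem (an assumption for illustration): maximize 3x + 2y.
domains = {"x": range(0, 6), "y": range(0, 6)}          # variables
constraints = [
    lambda a: a["x"] + 2 * a["y"] <= 8,                 # resource limit
    lambda a: a["x"] != a["y"],                         # values must differ
]
objective = lambda a: 3 * a["x"] + 2 * a["y"]           # objective

best, best_value = solve(domains, constraints, objective)
print(best, best_value)
```

In this sketch the model description contains no search logic at all, so a mistake in it can only be a modeling mistake, which is easier to audit than a hand-written search procedure.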

📝 Abstract

Large Language Models (LLMs) struggle to solve complex combinatorial problems through direct reasoning, so recent neuro-symbolic systems increasingly use them to synthesize executable solvers. A central design question is how the LLM should represent the solver, and whether it should also attempt to optimize search. We introduce CP-SynC-XL, a benchmark of 100 combinatorial problems (4,577 instances), and evaluate three solver-construction paradigms: native algorithmic search (Python), constraint modeling through a Python solver API (Python + OR-Tools), and declarative constraint modeling (MiniZinc + OR-Tools). We find a consistent representational divergence: Python + OR-Tools attains the highest correctness across LLMs, while MiniZinc + OR-Tools has lower absolute coverage despite using the same OR-Tools back-end. Native Python is the most likely to return a schema-valid solution that fails verification, whereas solver-backed paths preserve higher conditional fidelity. On the heuristic axis, prompting for search optimization yields only small median speed-ups (1.03-1.12x) and a strongly bimodal effect: many instances slow down, and correctness drops sharply on a long tail of problems. A paired code-level audit traces these regressions to a recurring heuristic trap. Under an efficiency-oriented prompt, the LLM may replace complete search with local approximations (Python), inject unverified bounds (Python + OR-Tools), or add redundant declarative machinery that overwhelms or over-constrains the model (MiniZinc + OR-Tools). These findings support a conservative design principle for LLM-generated combinatorial solvers: use the LLM primarily to formalize variables, constraints, and objectives for verified solvers, and separately check any LLM-authored search optimization before use.
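The native-Python form of the heuristic trap, where complete search is replaced by a local approximation, can be illustrated with a small sketch. This is a hypothetical example, not code from the paper's audit: the knapsack instance, function names, and the checking procedure are all assumptions. It shows why a solution can look valid (feasible and well-formed) yet still fail verification against a complete reference search.

```python
from itertools import combinations


def is_feasible(items, capacity, chosen):
    """Independent checker: distinct, in-range indices within capacity."""
    return (
        len(set(chosen)) == len(chosen)
        and all(0 <= i < len(items) for i in chosen)
        and sum(items[i][0] for i in chosen) <= capacity
    )


def total_value(items, chosen):
    return sum(items[i][1] for i in chosen)


def complete_search(items, capacity):
    """Exhaustive 0/1 knapsack: slow, but guaranteed optimal."""
    best = ()
    for size in range(len(items) + 1):
        for subset in combinations(range(len(items)), size):
            if is_feasible(items, capacity, subset) and (
                total_value(items, subset) > total_value(items, best)
            ):
                best = subset
    return best


def greedy_by_ratio(items, capacity):
    """The kind of 'optimization' an efficiency prompt may invite:
    fast, always feasible, but not guaranteed optimal."""
    chosen, remaining = [], capacity
    order = sorted(
        range(len(items)),
        key=lambda i: items[i][1] / items[i][0],
        reverse=True,
    )
    for i in order:
        if items[i][0] <= remaining:
            chosen.append(i)
            remaining -= items[i][0]
    return tuple(sorted(chosen))


# Hypothetical instance: (weight, value) pairs.
items = [(6, 60), (5, 45), (5, 45)]
capacity = 10

exact = complete_search(items, capacity)
fast = greedy_by_ratio(items, capacity)

# Gate the optimized variant on an independent check before using it.
passes_check = is_feasible(items, capacity, fast) and (
    total_value(items, fast) == total_value(items, exact)
)
print(exact, fast, passes_check)
```

Here the greedy result is feasible but worth less than the exhaustive optimum, so the check rejects it. Comparing an optimized variant against a complete baseline on small instances is one simple way to apply the abstract's advice to check LLM-authored search optimizations separately before use.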

Problem

Research questions and friction points this paper is trying to address.

combinatorial problems

large language models

solver synthesis

heuristic optimization

constraint modeling

Innovation

Methods, ideas, or system contributions that make the work stand out.

neuro-symbolic

combinatorial solvers

heuristic trap