Distribution-Aware Algorithm Design with LLM Agents

📅 2026-05-13

🤖 AI Summary

This work addresses the problem of learning executable solver code from samples of a task distribution, where the learned solver is judged on both solution quality and runtime rather than on correctness alone. The authors introduce an abstraction called a “solver hint”: reusable structure inferred from samples and compiled into specialized solver code. They prove that the empirically fastest sample-consistent solver from a fixed library generalizes in both correctness and runtime, and that statistically identifiable hints can be recovered and compiled from polynomially many samples. Instantiated with LLM code agents on 21 combinatorial-optimization distributions, the synthesized solvers reach a mean normalized quality of 0.971 and run 336.9× faster than the quality-best heuristic and 342.8× faster than Gurobi. On the released PACE 2025 Dominating Set private instances, the synthesized solver is valid on all 100 graphs and runs about two orders of magnitude faster than top competition solvers, with a moderate quality gap.

📝 Abstract

We study learning when the learned object is executable solver code rather than a predictor. In this setting, correctness is not enough: two solvers may both return valid solutions on the deployment distribution while differing substantially in runtime. Given samples from an unknown task distribution, the learner returns code evaluated on fresh instances by both solution quality and execution time. Our central abstraction is a “solver hint”: reusable structure inferred from samples and compiled into specialized solver code. We prove that the empirically fastest sample-consistent solver from a fixed library generalizes in both correctness and runtime, and that statistically identifiable hints can be recovered and compiled from polynomially many samples. Empirically, we instantiate the framework with LLM code agents on 21 structured combinatorial-optimization target distributions across seven problem classes. The synthesized solvers reach mean normalized quality 0.971, improve by +0.224 over the average heuristic pool and by +0.098 over the highest-quality heuristic, and are 336.9×, 342.8×, and 16.1× faster than the quality-best heuristic, Gurobi, and the selected time-limited exact backend, respectively. On released PACE 2025 Dominating Set private instances, the synthesized solver is valid on all 100 graphs and runs about two orders of magnitude faster than top competition solvers, with a moderate quality gap. Inspection shows that many gains come from changing the computational scale: replacing ambient exponential search or general-purpose optimization with compiled distribution-specific computation.
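
The selection rule named in the abstract, picking the empirically fastest sample-consistent solver from a fixed library, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function `select_fastest_consistent`, the toy task (return the maximum of a list), the toy distribution (lists that are always sorted), and the three library entries are all hypothetical, chosen only to show how structure in the distribution lets a cheaper solver pass the consistency check.

```python
import time
from typing import Callable, Dict, List, Optional, Sequence

Instance = List[int]
Solver = Callable[[Instance], int]


def select_fastest_consistent(
    library: Dict[str, Solver],
    samples: Sequence[Instance],
    is_valid: Callable[[Instance, int], bool],
) -> Optional[str]:
    """Return the name of the solver with the lowest total measured runtime
    among those valid on every sample, or None if no solver is consistent."""
    best_name: Optional[str] = None
    best_time = float("inf")
    for name, solver in library.items():
        total = 0.0
        consistent = True
        for inst in samples:
            # Time only the solver call, not the validity check.
            start = time.perf_counter()
            answer = solver(inst)
            total += time.perf_counter() - start
            if not is_valid(inst, answer):
                consistent = False
                break
        if consistent and total < best_time:
            best_name, best_time = name, total
    return best_name


# Toy distribution (hypothetical): every instance is an already-sorted list,
# and the task is to return its maximum.
samples = [list(range(start, start + 50_000)) for start in range(20)]

# Hypothetical fixed library of candidate solvers.
library: Dict[str, Solver] = {
    "full_scan": lambda xs: max(xs),    # correct on any list, linear time
    "last_element": lambda xs: xs[-1],  # correct only on sorted lists, constant time
    "first_element": lambda xs: xs[0],  # fast, but wrong on these samples
}

chosen = select_fastest_consistent(library, samples, lambda xs, a: a == max(xs))
print(chosen)
```

On this toy distribution the rule should prefer `last_element`: it agrees with the general-purpose `full_scan` on every sample but does constant work per instance, while `first_element` is rejected as inconsistent. The sketch measures wall-clock time on the samples only; whether the chosen solver stays valid and fast on fresh instances is the generalization question the paper's theory addresses.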

Problem

Research questions and friction points this paper is trying to address.

solver code

runtime efficiency

distribution-aware learning

combinatorial optimization

algorithm design

Innovation

Methods, ideas, or system contributions that make the work stand out.

distribution-aware algorithm design

solver hint

LLM code agents