Online Prompt and Solver Selection for Program Synthesis

πŸ“… 2025-01-09
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
Problem: Inefficiency and high cost in coordinating large language models (LLMs) with symbolic solvers for program synthesis. Method: This paper models solver selection (between LLMs and symbolic solvers) and LLM prompting-strategy optimization as a multi-armed bandit (MAB) online learning problem, proposing an adaptive decision framework benchmarked against the virtual best solver. The framework jointly optimizes LLM invocations, integrates symbolic solvers, and employs a reward-driven real-time decision mechanism to dynamically balance success rate, latency, and invocation cost. Contribution/Results: Evaluated on diverse program synthesis benchmarks, the approach solves 37.2% more queries than the best single-solver baseline and comes within 4% of the virtual best solver's performance, significantly improving the dynamic orchestration of heterogeneous solver resources.

πŸ“ Abstract
Large Language Models (LLMs) demonstrate impressive capabilities in the domain of program synthesis. This level of performance is not, however, universal across all tasks, all LLMs, and all prompting styles. There are many areas where one LLM dominates, one prompting style dominates, or where calling a symbolic solver is a better choice than an LLM. A key challenge for the user, then, is to identify not only when an LLM is the right choice of solver and which LLM to call for a given synthesis task, but also the right way to call it. A non-expert user who makes the wrong choice incurs a cost both in results (the number of tasks solved, and the time it takes to solve them) and in financial terms, if using a closed-source language model via a commercial API. We frame this choice as an online learning problem. We use a multi-armed bandit algorithm to select which symbolic solver, or LLM and prompt combination, to deploy in order to maximize a given reward function (which may prioritize solving time, number of synthesis tasks solved, or financial cost of solving). We implement an instance of this approach, called CYANEA, and evaluate it on synthesis queries from the literature in ranking function synthesis, from the syntax-guided synthesis competition, and fresh, unseen queries generated from SMT problems. CYANEA solves 37.2% more queries than the best single solver and achieves results within 4% of the virtual best solver.
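The bandit formulation described above can be sketched as follows. This is an illustrative toy, not CYANEA's implementation: it uses a standard UCB1 policy over hypothetical "arms" (a symbolic solver and two LLM prompting styles), and the success probabilities, latencies, costs, and reward weights are all assumed values chosen for the example.

```python
import math
import random

def ucb1_select(counts, values, t):
    """Pick the arm with the highest UCB1 score; try each arm once first."""
    for i, n in enumerate(counts):
        if n == 0:
            return i
    return max(range(len(counts)),
               key=lambda i: values[i] + math.sqrt(2 * math.log(t) / counts[i]))

def reward(solved, seconds, dollars, w_time=0.01, w_cost=0.1):
    """Scalar reward trading off success against latency and invocation cost."""
    return (1.0 if solved else 0.0) - w_time * seconds - w_cost * dollars

# Hypothetical arms: (name, success probability, latency in s, cost per call in $).
arms = [("smt_solver",        0.50, 2.0, 0.00),
        ("llm_direct_prompt", 0.70, 5.0, 0.02),
        ("llm_cot_prompt",    0.80, 9.0, 0.05)]

random.seed(0)
counts = [0] * len(arms)       # pulls per arm
values = [0.0] * len(arms)     # running mean reward per arm

for t in range(1, 501):
    i = ucb1_select(counts, values, t)
    name, p, secs, cost = arms[i]
    r = reward(random.random() < p, secs, cost)   # simulated solver outcome
    counts[i] += 1
    values[i] += (r - values[i]) / counts[i]      # incremental mean update

best = max(range(len(arms)), key=lambda i: counts[i])
print("most-pulled arm:", arms[best][0])
```

Over repeated queries, the policy concentrates pulls on whichever arm yields the best observed trade-off under the chosen reward weights, which is how a reward function prioritizing time, solve count, or cost steers solver selection.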
Problem

Research questions and friction points this paper is trying to address.

Programming Tasks
Large Language Models (LLMs)
Efficiency and Cost
Innovation

Methods, ideas, or system contributions that make the work stand out.

CYANEA System
Large Language Models
Task Efficiency
Yixuan Li
University of Edinburgh, UK
Lewis Frampton
University of Edinburgh, UK
Federico Mora
University of California, Berkeley, USA
Elizabeth Polgreen
University of Edinburgh, UK