Programming over Thinking: Efficient and Robust Multi-Constraint Planning

📅 2026-01-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the inefficiency and limited generalization of current large language model approaches to multi-constraint planning. Pure reasoning chains suffer from inconsistency and error accumulation, while coding- or solver-based approaches either regenerate problem-specific code from scratch or depend on fixed solvers. To overcome these limitations, the authors propose SCOPE, a framework that decouples query-specific reasoning from generic code execution by generating reusable, deterministic solver functions. These functions are adapted to a new query by adjusting only their input parameters, so diverse constraints can be satisfied without regenerating task-specific code. Evaluated on the TravelPlanner benchmark with GPT-4o, SCOPE achieves a 93.1% success rate, a 61.6% gain over the best baseline (CoT), while reducing inference cost by 1.4× and latency by approximately 4.67×.

📝 Abstract

Multi-constraint planning involves identifying, evaluating, and refining candidate plans while satisfying multiple, potentially conflicting constraints. Existing large language model (LLM) approaches face fundamental limitations in this domain. Pure reasoning paradigms, which rely on long natural language chains, are prone to inconsistency, error accumulation, and prohibitive cost as constraints compound. Conversely, LLMs combined with coding- or solver-based strategies lack flexibility: they often generate problem-specific code from scratch or depend on fixed solvers, failing to capture generalizable logic across diverse problems. To address these challenges, we introduce the Scalable COde Planning Engine (SCOPE), a framework that disentangles query-specific reasoning from generic code execution. By separating reasoning from execution, SCOPE produces solver functions that are consistent, deterministic, and reusable across queries while requiring only minimal changes to input parameters. SCOPE achieves state-of-the-art performance while lowering cost and latency. For example, with GPT-4o, it reaches 93.1% success on TravelPlanner, a 61.6% gain over the best baseline (CoT), while cutting inference cost by 1.4× and time by ~4.67×. Code is available at https://github.com/DerrickGXD/SCOPE.
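The separation of query-specific reasoning from generic execution can be illustrated with a minimal Python sketch. This is not the SCOPE implementation: the `Option` class, the `solve_plan` function, the toy catalogue, and the parameter names are all hypothetical, and the brute-force search stands in for whatever solver logic the framework actually generates. The point is only that one deterministic function is written once, and each query contributes nothing but a small set of parameters (which, in the paper's setting, would be extracted by the LLM).

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Option:
    """One candidate choice for a plan slot (hypothetical toy schema)."""
    name: str
    cost: float
    tags: frozenset = frozenset()


def solve_plan(slots, budget, required_tags=frozenset()):
    """Generic, deterministic solver that picks one option per slot.

    Returns the cheapest combination whose total cost is within `budget`
    and whose combined tags cover `required_tags`, or None if no
    combination qualifies. Nothing here depends on a particular query.
    """
    best, best_cost = None, float("inf")
    slot_names = sorted(slots)  # fixed order keeps the result deterministic
    for combo in product(*(slots[s] for s in slot_names)):
        total = sum(option.cost for option in combo)
        tags = frozenset().union(*(option.tags for option in combo))
        if total <= budget and required_tags <= tags and total < best_cost:
            best, best_cost = dict(zip(slot_names, combo)), total
    return best


# Toy catalogue, invented for illustration only.
SLOTS = {
    "hotel": [
        Option("Budget Inn", 80, frozenset({"pets"})),
        Option("Grand Hotel", 200, frozenset({"pets", "pool"})),
    ],
    "transport": [Option("bus", 30), Option("flight", 150)],
}

# Query-specific part: only the parameters change between queries.
query_a = {"budget": 150, "required_tags": frozenset({"pets"})}
query_b = {"budget": 400, "required_tags": frozenset({"pool"})}

plan_a = solve_plan(SLOTS, **query_a)
plan_b = solve_plan(SLOTS, **query_b)
```

Both queries reuse `solve_plan` unchanged. A constraint that no combination can satisfy (for example, a budget of 50) makes the function return `None` rather than an invalid plan, which is the kind of consistency a free-form reasoning chain cannot guarantee.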

Problem

Research questions and friction points this paper is trying to address.

multi-constraint planning

large language models

reasoning

code generation

constraint satisfaction

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-constraint planning

code-reasoning disentanglement

reusable solver functions