PaT: Planning-after-Trial for Efficient Test-Time Code Generation

📅 2026-05-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the inefficiency of prevailing test-time compute methods, which typically follow a planning-first strategy and therefore pay planning overhead even on problems that can be solved without explicit planning. To overcome this limitation, we propose Planning-after-Trial (PaT), a verification-driven adaptive policy that invokes a heavyweight planning model only when the output of a lightweight generation model fails verification. By combining heterogeneous language model collaboration with test-time compute scaling, PaT advances the cost-performance Pareto frontier across multiple benchmarks, achieving performance comparable to a large homogeneous model at approximately 69% lower inference cost.

📝 Abstract

Beyond training-time optimization, scaling test-time computation has emerged as a key paradigm to extend the reasoning capabilities of Large Language Models (LLMs). However, most existing methods adopt a rigid Planning-before-Trial (PbT) policy, which inefficiently allocates test-time compute by incurring planning overhead even on directly solvable problems. We propose Planning-after-Trial (PaT), an adaptive policy for code generation that invokes a planner only upon verification failure. This adaptive policy naturally enables a heterogeneous model configuration: a cost-efficient model handles generation attempts, while a powerful model is reserved for targeted planning interventions. Empirically, across multiple benchmarks and model families, our approach significantly advances the cost-performance Pareto frontier. Notably, our heterogeneous configuration achieves performance comparable to a large homogeneous model while reducing inference cost by approximately 69%.
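The control flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`plan_after_trial`, `cheap_generate`, `strong_plan`, `run_tests`), the `max_attempts` budget, and the idea of passing the planner's output back to the generator as a hint are all assumptions made for illustration. The toy stand-ins replace real model calls and a real test harness.

```python
from typing import Callable, Optional, Tuple


def plan_after_trial(
    problem: str,
    generate: Callable[[str, Optional[str]], str],  # hypothetical cheap model
    plan: Callable[[str, str], str],                # hypothetical strong planner
    verify: Callable[[str], bool],                  # hypothetical verifier (e.g. unit tests)
    max_attempts: int = 3,                          # assumed attempt budget
) -> Tuple[Optional[str], int]:
    """Return (verified solution or None, number of planner calls)."""
    planner_calls = 0
    hint: Optional[str] = None
    for attempt in range(max_attempts):
        # Trial first: the cheap model attempts the problem directly.
        candidate = generate(problem, hint)
        if verify(candidate):
            return candidate, planner_calls
        # Plan only after a failed trial, and only if another attempt remains.
        if attempt + 1 < max_attempts:
            hint = plan(problem, candidate)
            planner_calls += 1
    return None, planner_calls


# Toy stand-ins so the sketch runs without any model API.
def cheap_generate(problem: str, hint: Optional[str]) -> str:
    # Pretend the cheap model only succeeds once it has a plan.
    if hint:
        return "def add(a, b): return a + b"
    return "def add(a, b): return a - b"


def strong_plan(problem: str, failed_code: str) -> str:
    return "Use addition, not subtraction."


def run_tests(code: str) -> bool:
    scope: dict = {}
    exec(code, scope)  # acceptable for a toy example; sandbox real model output
    return scope["add"](2, 3) == 5


solution, calls = plan_after_trial(
    "add two numbers", cheap_generate, strong_plan, run_tests
)
print(calls)  # 1: the planner ran once, after the first trial failed
```

On a problem the cheap model solves on its first attempt, the loop returns with zero planner calls, which is where the savings over a planning-first policy would come from under these assumptions.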

Problem

Research questions and friction points this paper is trying to address.

test-time computation

code generation

planning efficiency

Large Language Models

cost-performance trade-off

Innovation

Methods, ideas, or system contributions that make the work stand out.

Planning-after-Trial

test-time computation

heterogeneous model configuration