Think Anywhere in Code Generation

📅 2026-03-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a key limitation of conventional large language models in code generation: they rely on upfront reasoning and struggle to adapt dynamically to complexity that emerges incrementally during implementation. The authors propose Think-Anywhere, a novel mechanism that enables on-demand reasoning at any point during code generation. By combining cold-start imitation learning with outcome-based reinforcement learning, the approach achieves adaptive reasoning scheduling, overcoming the constraints of fixed pre-generation reasoning, and improves the model's responsiveness and interpretability at high-entropy positions in the code. Empirical evaluations demonstrate state-of-the-art performance across four major benchmarks—LeetCode, LiveCodeBench, HumanEval, and MBPP—and consistent improvements across diverse large language models, underscoring strong generalization capability.
📝 Abstract
Recent advances in reasoning Large Language Models (LLMs) have primarily relied on upfront thinking, where reasoning occurs before the final answer. However, this approach suffers from critical limitations in code generation, where upfront thinking is often insufficient because a problem's full complexity only reveals itself during code implementation. Moreover, it cannot adaptively allocate reasoning effort throughout the code generation process, where difficulty varies significantly. In this paper, we propose Think-Anywhere, a novel reasoning mechanism that enables LLMs to invoke thinking on demand at any token position during code generation. We achieve Think-Anywhere by first teaching LLMs to imitate the reasoning patterns through cold-start training, then leveraging outcome-based RL rewards to drive the model's autonomous exploration of when and where to invoke reasoning. Extensive experiments on four mainstream code generation benchmarks (i.e., LeetCode, LiveCodeBench, HumanEval, and MBPP) show that Think-Anywhere achieves state-of-the-art performance over both existing reasoning methods and recent post-training approaches, while demonstrating consistent generalization across diverse LLMs. Our analysis further reveals that Think-Anywhere enables the model to adaptively invoke reasoning at high-entropy positions, providing enhanced interpretability.
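The paper does not publish implementation details here, but the core idea of invoking reasoning at high-entropy token positions can be illustrated with a toy decoding loop. Everything below is a hypothetical sketch: the `<think>` delimiters, the entropy threshold, and the dummy next-token distributions are assumptions for illustration, not the authors' actual mechanism.

```python
import math

THINK_OPEN, THINK_CLOSE = "<think>", "</think>"  # hypothetical delimiters
ENTROPY_THRESHOLD = 1.5  # hypothetical trigger value, in nats


def entropy(probs):
    """Shannon entropy of a next-token distribution, in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def generate_with_think_anywhere(steps):
    """Toy decoding loop over (token, next_token_probs) pairs.

    When next-token entropy exceeds the threshold, an inline reasoning
    segment is emitted before the token, mimicking on-demand thinking
    at high-entropy (i.e., high-uncertainty) positions.
    """
    out = []
    for token, probs in steps:
        if entropy(probs) > ENTROPY_THRESHOLD:
            # High uncertainty: pause code emission and reason inline.
            out.append(f"{THINK_OPEN} reason before '{token}' {THINK_CLOSE}")
        out.append(token)
    return out


# A confident step (peaked distribution) vs. an uncertain one (near-uniform).
steps = [
    ("def", [0.97, 0.01, 0.01, 0.01]),      # low entropy: emit directly
    ("solve", [0.2, 0.2, 0.2, 0.2, 0.2]),   # high entropy: think first
]
print(generate_with_think_anywhere(steps))
```

In the paper's actual setup, *where* to invoke thinking is learned via cold-start imitation plus outcome-based RL rather than a fixed entropy cutoff; the threshold here only serves to show why high-entropy positions are natural trigger points.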
Problem

Research questions and friction points this paper is trying to address.

code generation
reasoning
Large Language Models
adaptive reasoning
upfront thinking
Innovation

Methods, ideas, or system contributions that make the work stand out.

Think-Anywhere
on-demand reasoning
code generation
reinforcement learning
adaptive reasoning