AI Summary
Large language models (LLMs) struggle to scale inference-time computation effectively without external feedback signals. Method: This paper proposes an adaptive branching tree search framework that integrates external feedback directly into the inference-time search process. It introduces a dynamic "wide-or-deep" decision mechanism: at each tree node, the search adaptively chooses between expanding new candidate responses (going wider) or iteratively refining existing ones (going deeper). The framework combines diversity-aware response generation, feedback-driven node evaluation, and an enhanced Monte Carlo Tree Search (MCTS) algorithm. Contribution/Results: Experiments show substantial improvements over repeated sampling and standard MCTS on code generation and engineering tasks. The results highlight the critical role of combining exploration and exploitation in inference-time scaling and establish a new paradigm for enhancing LLM reasoning with external feedback.
Abstract
Recent advances demonstrate that increasing inference-time computation can significantly boost the reasoning capabilities of large language models (LLMs). Although repeated sampling (i.e., generating multiple candidate outputs) is a highly effective strategy, it does not leverage external feedback signals for refinement, which are often available in tasks like coding. In this work, we propose Adaptive Branching Monte Carlo Tree Search (AB-MCTS), a novel inference-time framework that generalizes repeated sampling with principled multi-turn exploration and exploitation. At each node in the search tree, AB-MCTS dynamically decides whether to "go wider" by expanding new candidate responses or "go deeper" by revisiting existing ones based on external feedback signals. We evaluate our method on complex coding and engineering tasks using frontier models. Empirical results show that AB-MCTS consistently outperforms both repeated sampling and standard MCTS, underscoring the importance of combining the response diversity of LLMs with multi-turn solution refinement for effective inference-time scaling.
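The wide-or-deep decision described above can be illustrated with a minimal sketch. This is not the paper's actual algorithm (AB-MCTS uses a principled MCTS-based selection rule): it is a toy heuristic, with placeholder `generate`, `refine`, and `feedback` functions standing in for LLM calls and an external evaluator (e.g., a unit-test pass rate for coding tasks), and a hypothetical `widen_threshold` parameter controlling the trade-off.

```python
import random

def generate():
    # Placeholder: sample a fresh candidate response from the LLM.
    return {"text": f"candidate-{random.random():.3f}"}

def refine(candidate):
    # Placeholder: revise an existing candidate using feedback.
    return {"text": candidate["text"] + "+refined"}

def feedback(candidate):
    # Placeholder score in [0, 1]; in coding tasks this could be
    # the fraction of unit tests the candidate passes.
    return random.random()

def wide_or_deep_search(budget=8, widen_threshold=0.5):
    """Toy wide-or-deep loop at a single tree node: go wider when no
    child looks promising yet, otherwise go deeper on the best child."""
    children = []  # list of (score, candidate) pairs
    for _ in range(budget):
        best = max(children, default=None, key=lambda c: c[0])
        if best is None or best[0] < widen_threshold:
            cand = generate()       # "go wider": add a new sibling
        else:
            cand = refine(best[1])  # "go deeper": refine the best child
        children.append((feedback(cand), cand))
    return max(children, key=lambda c: c[0])

best_score, best_candidate = wide_or_deep_search()
```

Repeated sampling corresponds to always taking the "go wider" branch; the sketch instead spends part of the budget refining whichever candidate external feedback rates highest.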