🤖 AI Summary
This work addresses a critical limitation in existing tree search decoding methods for large language model inference: their disregard for fixed token budget constraints, which often leads to excessive branching in later stages or premature termination. To overcome this, the authors propose Budget-Guided Monte Carlo Tree Search (BG-MCTS), the first approach to explicitly integrate token budgets into the tree search strategy. BG-MCTS dynamically balances exploration and exploitation—favoring broad exploration early on and progressively focusing on answer refinement while suppressing late-stage branching at shallow nodes as the budget depletes. By incorporating dynamic branch control and priority-based scheduling, BG-MCTS consistently outperforms existing budget-agnostic tree search methods across varying token budgets on the MATH500 and AIME24/25 benchmarks using open-source large language models.
📝 Abstract
Tree-search decoding is an effective form of test-time scaling for large language models (LLMs), but real-world deployment imposes a fixed per-query token budget that varies across settings. Existing tree-search policies are largely budget-agnostic, treating the budget as a termination condition, which can lead to late-stage over-branching or premature termination. We propose Budget-Guided MCTS (BG-MCTS), a tree-search decoding algorithm that aligns its search policy with the remaining token budget: it starts with broad exploration, then prioritizes refinement and answer completion as the budget depletes while reducing late-stage branching from shallow nodes. BG-MCTS consistently outperforms budget-agnostic tree-search baselines across different budgets on MATH500 and AIME24/25 with open-weight LLMs.
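The abstract's two mechanisms (budget-dependent branch control and budget-weighted node priority) can be sketched as follows. This is a minimal illustration under assumed formulas, not the paper's actual rules: the function names `branch_limit` and `selection_priority`, the linear budget scaling, and the shallow-node penalty are all hypothetical choices standing in for BG-MCTS's real policy.

```python
import math

def branch_limit(remaining_frac: float, depth: int, max_children: int = 4) -> int:
    """Allowed number of children for a node (hypothetical rule).

    Broad branching while most of the token budget remains; as the budget
    depletes, branching narrows, and shallow nodes (small depth) are
    suppressed hardest, mirroring the paper's late-stage behavior.
    """
    # Penalty grows as the budget empties and is strongest at shallow depth.
    shallow_penalty = (1.0 - remaining_frac) / (depth + 1)
    allowed = max_children * remaining_frac - shallow_penalty
    return max(1, int(round(allowed)))

def selection_priority(value: float, visits: int, parent_visits: int,
                       remaining_frac: float, c: float = 1.4) -> float:
    """UCT-style node priority with a budget-weighted exploration bonus.

    With a full budget this behaves like standard UCT (explore broadly);
    as remaining_frac -> 0 the bonus vanishes and selection becomes pure
    exploitation of high-value (refinement/answer-completion) nodes.
    """
    explore = c * remaining_frac * math.sqrt(
        math.log(parent_visits + 1) / (visits + 1))
    return value + explore
```

With a full budget a shallow node may expand up to `max_children` branches, while near exhaustion the same node is capped at a single child, so remaining tokens flow into completing existing answer paths rather than opening new ones.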