🤖 AI Summary
This work addresses the high variability and long-tail latency of Monte Carlo Tree Search (MCTS) under test-time compute scaling, which stem from inefficient search trajectories and limit the effectiveness of existing optimizations when search progress stalls. The authors propose a negative early-exit mechanism that proactively prunes unproductive trajectories, together with an adaptive boosting strategy that dynamically reallocates the freed computational resources, thereby mitigating resource contention among parallel searches. Implemented within the vLLM inference framework, this approach significantly reduces end-to-end p99 latency and improves system throughput while preserving the accuracy of large language model inference.
📝 Abstract
Monte Carlo Tree Search (MCTS) is an effective test-time compute scaling (TTCS) method for improving the reasoning performance of large language models, but its highly variable execution time leads to severe long-tail latency in practice. Existing optimizations, such as positive early exit, reduce latency in favorable cases but are less effective when the search continues without meaningful progress. We introduce *negative early exit*, which prunes unproductive MCTS trajectories, and an *adaptive boosting mechanism* that reallocates the reclaimed computation to reduce resource contention among concurrent searches. Integrated into vLLM, these techniques substantially reduce p99 end-to-end latency while improving throughput and maintaining reasoning accuracy.
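The interplay of the two ideas can be sketched as a toy scheduling loop. This is not the paper's implementation: the per-iteration gain traces stand in for real MCTS value estimates, and `stall_window`, `budget_per_search`, and the shared budget pool are hypothetical simplifications of vLLM-level scheduling. A search whose value stops improving for `stall_window` consecutive iterations exits early (negative early exit), and its unused budget is handed to still-active searches (adaptive boosting):

```python
def run_searches(improvement_traces, budget_per_search, stall_window=2):
    """Toy model: improvement_traces maps a query name to its per-iteration
    value gains. Stalled searches are pruned and their leftover iteration
    budget is pooled for reuse by active searches."""
    active = {name: {"budget": budget_per_search, "value": 0.0,
                     "stall": 0, "step": 0}
              for name in improvement_traces}
    pool = 0        # reclaimed iterations available for boosting
    finished = {}   # name -> (final value, termination reason)
    while active:
        for name in list(active):
            s = active[name]
            if s["budget"] == 0:
                if pool > 0:
                    pool -= 1       # adaptive boosting: draw reclaimed budget
                    s["budget"] = 1
                else:
                    finished[name] = (s["value"], "budget exhausted")
                    del active[name]
                    continue
            trace = improvement_traces[name]
            gain = trace[s["step"]] if s["step"] < len(trace) else 0.0
            s["step"] += 1
            s["budget"] -= 1
            s["value"] += gain
            s["stall"] = 0 if gain > 0 else s["stall"] + 1
            if s["stall"] >= stall_window:
                pool += s["budget"]  # negative early exit: reclaim compute
                finished[name] = (s["value"], "pruned (negative early exit)")
                del active[name]
    return finished, pool

finished, pool = run_searches(
    {"A": [0.5, 0.0, 0.0, 0.0],          # stalls after one useful step
     "B": [0.2, 0.3, 0.1, 0.2, 0.1]},    # keeps making progress
    budget_per_search=4)
```

In this run, search A is pruned after two zero-gain iterations, and the iteration it never used lets search B run one step beyond its original budget. The real system applies the same principle to concurrent MCTS requests inside the inference engine, where a reclaimed trajectory frees KV-cache and batch slots rather than loop iterations.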