Adaptive Parallel Monte Carlo Tree Search for Efficient Test-time Compute Scaling

📅 2026-04-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the highly variable execution time and long-tail latency of Monte Carlo Tree Search (MCTS) when it is used for test-time compute scaling. Existing optimizations such as positive early exit help in favorable cases but are less effective when a search continues without meaningful progress. The authors propose a negative early-exit mechanism that prunes unproductive trajectories, together with an adaptive boosting mechanism that reallocates the reclaimed computation to reduce resource contention among concurrent searches. Integrated into the vLLM inference framework, the approach substantially reduces p99 end-to-end latency and improves throughput while maintaining reasoning accuracy.
📝 Abstract
Monte Carlo Tree Search (MCTS) is an effective test-time compute scaling (TTCS) method for improving the reasoning performance of large language models, but its highly variable execution time leads to severe long-tail latency in practice. Existing optimizations, such as positive early exit, reduce latency in favorable cases but are less effective when search continues without meaningful progress. We introduce negative early exit, which prunes unproductive MCTS trajectories, and an adaptive boosting mechanism that reallocates reclaimed computation to reduce resource contention among concurrent searches. Integrated into vLLM, these techniques substantially reduce p99 end-to-end latency while improving throughput and maintaining reasoning accuracy.
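To make the two ideas in the abstract concrete, the following is a minimal toy sketch, not the authors' implementation. The abstract does not state the pruning criterion or the reallocation policy, so both are assumptions here: a search is pruned after `patience` consecutive iterations without improving its best reward by more than `min_gain`, and its unused iteration budget is split evenly among the searches still running. The `Search` class, the `run` function, and all parameter names are hypothetical and unrelated to vLLM's API.

```python
from dataclasses import dataclass


@dataclass
class Search:
    """One in-flight search (hypothetical bookkeeping, not vLLM's API)."""
    name: str
    rewards: list[float]        # toy trace: reward observed at each iteration
    budget: int                 # iterations this search may still run
    best: float = float("-inf")
    stalled: int = 0            # consecutive iterations without improvement
    step: int = 0
    done: bool = False


def run(searches, patience=3, min_gain=1e-3):
    """Round-robin over searches; prune stalled ones and reassign their budget."""
    while any(not s.done for s in searches):
        for s in searches:
            if s.done:
                continue
            if s.budget == 0 or s.step >= len(s.rewards):
                s.done = True   # budget or trace exhausted
                continue
            r = s.rewards[s.step]
            s.step += 1
            s.budget -= 1
            if r > s.best + min_gain:
                s.best, s.stalled = r, 0
            else:
                s.stalled += 1
            if s.stalled >= patience:
                # Negative early exit (assumed criterion): stop a stalled search.
                s.done = True
                live = [t for t in searches if not t.done]
                if live:
                    # Adaptive boosting (assumed policy): even split of leftovers.
                    share, extra = divmod(s.budget, len(live))
                    for i, t in enumerate(live):
                        t.budget += share + (1 if i < extra else 0)
                s.budget = 0
    return {s.name: (s.best, s.step) for s in searches}


if __name__ == "__main__":
    stalled = Search("A", [0.1] * 6, budget=6)
    improving = Search("B", [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8], budget=4)
    print(run([stalled, improving]))
```

In this toy trace, search A stops improving after its first iteration and is pruned at iteration 4 with two iterations of budget left. Those two iterations go to search B, which then runs for 6 iterations instead of its original 4. A real system would reallocate GPU or scheduler resources rather than an iteration counter.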
Problem

Research questions and friction points this paper is trying to address.

Monte Carlo Tree Search
test-time compute scaling
long-tail latency
execution time variability
large language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

negative early exit
adaptive boosting
Monte Carlo Tree Search
test-time compute scaling
latency optimization
Hongbeen Kim
School of Computing, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
Juhyun Lee
University of Texas at Arlington
Cardiac Development, Biomechanics, Optical Imaging
Sanghyeon Lee
School of Computing, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
Kwanghoon Choi
Chonnam National University, Gwang-Ju, Republic of Korea
Programming Languages, Software Security
Jaehyuk Huh
KAIST
Computer Architecture, Operating Systems, System Security