MITS: Enhanced Tree Search Reasoning for LLMs via Pointwise Mutual Information

📅 2025-10-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Tree-search-based reasoning for large language models (LLMs) faces two obstacles: the quality of intermediate steps is hard to quantify, and the search itself is computationally expensive. This paper proposes MITS, a framework for efficient and robust reasoning. MITS uses pointwise mutual information (PMI) as a step-level evaluation metric, which allows intermediate states to be scored without costly lookahead simulation. An entropy-driven dynamic beam sampling strategy adaptively allocates computation during search, and the final answer is selected by combining PMI-weighted aggregation with consensus voting. Across multiple reasoning benchmarks, MITS matches or exceeds the accuracy of state-of-the-art baselines while reducing inference cost by 30%–50%, indicating that the method is effective, computationally efficient, and generalizes across tasks.
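
For reference, the standard definition of pointwise mutual information between two outcomes x and y is written out below. Reading x as the question q and y as a partial reasoning path S is an assumption made here for illustration; this summary does not state the paper's exact scoring formula.

```latex
\mathrm{PMI}(x; y) = \log \frac{p(x, y)}{p(x)\,p(y)} = \log \frac{p(y \mid x)}{p(y)}

% Illustrative reading (assumption): x = q (question), y = S (reasoning path)
\mathrm{PMI}(q; S) = \log p(S \mid q) - \log p(S)
```

Under this reading, a positive score means the path is more probable with the question in context than without it, which is one way to interpret a step as being specific to the question rather than generically fluent.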

📝 Abstract

Tree search has emerged as a representative framework for test-time reasoning with large language models (LLMs), exemplified by methods such as Tree-of-Thought and Monte Carlo Tree Search that explore multiple reasoning paths. However, it remains difficult to provide instant and reliable quantitative assessments of intermediate reasoning step quality, and extensive path exploration is computationally costly. To address this, we propose Mutual Information Tree Search (MITS), a novel framework that guides reasoning with information-theoretic principles. MITS introduces an effective scoring function based on pointwise mutual information (PMI), which enables step-wise evaluation of reasoning paths and search tree expansion via beam search without expensive look-ahead simulations, achieving superior reasoning performance while maintaining computational efficiency. The framework is complemented by an entropy-based dynamic sampling strategy that adaptively allocates computational resources to uncertain reasoning steps where exploration is most beneficial. For final prediction, MITS employs a weighted voting scheme that combines PMI scores with prediction consensus. Through comprehensive experiments on diverse reasoning benchmarks, MITS consistently surpasses baseline methods, establishing a principled and efficient framework for LLM reasoning.
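
The PMI-guided beam search described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `pmi_score` and `prune_beam` are hypothetical, and the per-token log-probabilities are assumed to come from two LLM passes over the same path tokens, one with the question in context and one without.

```python
from typing import List, Sequence, Tuple


def pmi_score(cond_logprobs: Sequence[float], uncond_logprobs: Sequence[float]) -> float:
    """Return log p(S | q) - log p(S) computed from per-token log-probabilities.

    cond_logprobs:   log-probs of the path tokens with the question in context.
    uncond_logprobs: log-probs of the same tokens without the question.
    Both inputs are hypothetical; a real system would obtain them from the LLM.
    """
    if len(cond_logprobs) != len(uncond_logprobs):
        raise ValueError("both sequences must cover the same tokens")
    return sum(cond_logprobs) - sum(uncond_logprobs)


def prune_beam(candidates: List[Tuple[str, float]], beam_width: int) -> List[Tuple[str, float]]:
    """Keep the beam_width candidate paths with the highest PMI score."""
    return sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
```

Because the score needs only log-probabilities of tokens that have already been generated, each candidate can be ranked as soon as it is produced, with no rollout to a final answer.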

Problem

Research questions and friction points this paper is trying to address.

Enhancing tree search reasoning in LLMs via pointwise mutual information

Providing instant quantitative assessment of intermediate reasoning steps

Reducing computational costs of extensive path exploration

Innovation

Methods, ideas, or system contributions that make the work stand out.

PMI-based scoring enables stepwise path evaluation

Entropy-based sampling allocates computational resources adaptively

Weighted voting combines PMI scores with prediction consensus
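
The last two ideas above, entropy-based allocation and PMI-weighted voting, might look like the sketch below. Several details are assumptions made for illustration and are not taken from the paper: the names `samples_for_step` and `weighted_vote`, the linear mapping from normalized entropy to a sample count, and the use of softmax-style weights over PMI scores.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Sequence, Tuple


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def samples_for_step(probs: Sequence[float], min_samples: int = 1, max_samples: int = 5) -> int:
    """Assumed rule: draw more candidates when the step distribution is uncertain."""
    if len(probs) < 2:
        return min_samples
    # Normalize by the maximum possible entropy so the value lies in [0, 1].
    normalized = min(1.0, entropy(probs) / math.log(len(probs)))
    return min_samples + round(normalized * (max_samples - min_samples))


def weighted_vote(paths: Iterable[Tuple[str, float]]) -> str:
    """Pick the answer with the largest total weight across finished paths.

    Each path is (final_answer, pmi_score). Weights are exp(score - max_score),
    an assumed choice, so agreement among paths and high PMI both count.
    """
    paths = list(paths)
    if not paths:
        raise ValueError("at least one finished path is required")
    top = max(score for _, score in paths)
    totals: Dict[str, float] = defaultdict(float)
    for answer, score in paths:
        totals[answer] += math.exp(score - top)
    return max(totals, key=lambda answer: totals[answer])
```

With these weights, several moderately scored paths that agree can outvote a single slightly better path, while one path with a much higher score can still win over weak agreement.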

🔎 Similar Papers

Interpretable Contrastive Monte Carlo Tree Search Reasoning