🤖 AI Summary
To address the high hallucination rates, low factual accuracy, and poor response consistency of small language models (SLMs) on knowledge-intensive tasks, this paper proposes MCTS-RAG, a reasoning framework that deeply integrates Monte Carlo Tree Search (MCTS) with Retrieval-Augmented Generation (RAG). Unlike prior approaches, MCTS-RAG jointly optimizes retrieval context and reasoning paths at every MCTS node. It introduces a confidence-driven iterative re-ranking and node-expansion mechanism, making the inference process adaptive and explicitly fact-aware. Evaluated on the ComplexWebQA, GPQA, and FoolMeTwice benchmarks, MCTS-RAG enables SLMs to match the performance of GPT-4o while substantially reducing hallucinations and significantly improving both factual accuracy and response consistency. This work is the first framework to achieve tight, joint optimization of retrieval and generation within an MCTS-based reasoning paradigm.
📄 Abstract
We introduce MCTS-RAG, a novel approach that enhances the reasoning capabilities of small language models on knowledge-intensive tasks by leveraging retrieval-augmented generation (RAG) to provide relevant context and Monte Carlo Tree Search (MCTS) to refine reasoning paths. MCTS-RAG dynamically integrates retrieval and reasoning through an iterative decision-making process. Unlike standard RAG methods, which typically retrieve information independently of reasoning and thus integrate knowledge suboptimally, or conventional MCTS reasoning, which depends solely on internal model knowledge without external facts, MCTS-RAG combines structured reasoning with adaptive retrieval. This integrated approach enhances decision-making, reduces hallucinations, and improves both factual accuracy and response consistency. Experimental results on multiple reasoning and knowledge-intensive datasets (i.e., ComplexWebQA, GPQA, and FoolMeTwice) show that our method enables small-scale LMs to achieve performance comparable to frontier LLMs like GPT-4o by effectively scaling inference-time compute, setting a new standard for reasoning in small-scale models.
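To make the high-level idea concrete, the search loop described above can be sketched as standard MCTS in which each tree node chooses between a *retrieve* action (pulling external facts into the reasoning state) and an *answer* action, so retrieval decisions are explored and scored alongside reasoning steps. This is a minimal toy sketch, not the paper's implementation: the fact table, `toy_retrieve`, `toy_answer`, and the two-action space are all hypothetical stand-ins for a real retriever and LM-based evaluation.

```python
import math
import random

# Hypothetical stand-in for an external knowledge source / retriever.
FACTS = {"capital of France": "Paris"}

def toy_retrieve(state, query):
    """RETRIEVE action: append a retrieved fact to the reasoning state."""
    return state + [f"fact: {query} -> {FACTS.get(query, '')}"]

def toy_answer(state):
    """Toy terminal reward: 1.0 if the needed fact reached the state."""
    return 1.0 if any("Paris" in step for step in state) else 0.0

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.value = [], 0, 0.0

    def uct(self, c=1.4):
        # Unvisited children are explored first; otherwise standard UCT.
        if self.visits == 0:
            return float("inf")
        return (self.value / self.visits
                + c * math.sqrt(math.log(self.parent.visits) / self.visits))

def expand(node):
    # Two candidate actions per node: retrieve external context, or
    # commit to an answer. Retrieval is thus searched, not bolted on.
    node.children = [
        Node(toy_retrieve(node.state, "capital of France"), parent=node),
        Node(node.state + ["answer"], parent=node),
    ]

def mcts(iterations=50, seed=0):
    random.seed(seed)
    root = Node([])
    for _ in range(iterations):
        node = root
        while node.children:                      # selection
            node = max(node.children, key=Node.uct)
        if node.visits > 0 and node.state[-1:] != ["answer"]:
            expand(node)                          # expansion
            node = random.choice(node.children)
        reward = toy_answer(node.state)           # evaluation / rollout
        while node is not None:                   # backpropagation
            node.visits += 1
            node.value += reward
            node = node.parent
    # Return the state of the most-visited first-level action.
    return max(root.children, key=lambda n: n.visits).state

print(mcts())
```

In this toy run the retrieve branch accumulates reward while the answer-without-retrieval branch does not, so the search converges on retrieving first, illustrating (in miniature) how MCTS can decide *when* retrieval pays off.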