Enhancing Large Language Models with Reward-guided Tree Search for Knowledge Graph Question and Answering

📅 2025-05-18
📈 Citations: 0
Influential citations: 0
🤖 AI Summary
To address inaccurate reasoning path retrieval and underutilization of historical paths in knowledge graph question answering (KGQA), this paper proposes Reward-guided Tree Search on Graph (RTSoG), a training-free framework. Methodologically, RTSoG introduces a novel synergy between subproblem decomposition and Self-Critic Monte Carlo Tree Search (SC-MCTS), enabling weighted reuse of historical reasoning paths and reward-guided dynamic search over the knowledge graph. It incorporates lightweight reward modeling and a weighted path stacking generation strategy to enhance path discovery quality for semantically complex questions. Evaluated on GrailQA and WebQSP benchmarks, RTSoG achieves absolute improvements of 8.7% and 7.0% over prior state-of-the-art methods, respectively. These results demonstrate substantial gains in both accuracy and robustness for KGQA systems.

📝 Abstract
Recently, large language models (LLMs) have demonstrated impressive performance on Knowledge Graph Question Answering (KGQA) tasks, which aim to find answers to natural language questions based on knowledge graphs (KGs). Existing LLM-based KGQA methods typically follow the Graph Retrieval-Augmented Generation (GraphRAG) paradigm, which first retrieves reasoning paths from large KGs and then generates answers based on them. However, these methods emphasize the exploration of new optimal reasoning paths in KGs while ignoring the exploitation of historical reasoning paths, which may lead to sub-optimal reasoning paths. Additionally, the complex semantics contained in questions may lead to the retrieval of inaccurate reasoning paths. To address these issues, this paper proposes a novel, training-free framework for KGQA tasks called Reward-guided Tree Search on Graph (RTSoG). RTSoG decomposes an original question into a series of simpler, well-defined sub-questions to handle the complex semantics. Then, a Self-Critic Monte Carlo Tree Search (SC-MCTS) guided by a reward model is introduced to iteratively retrieve weighted reasoning paths as contextual knowledge. Finally, it stacks the weighted reasoning paths according to their weights to generate the final answers. Extensive experiments on four datasets demonstrate the effectiveness of RTSoG. Notably, it achieves 8.7% and 7.0% performance improvements over the state-of-the-art method on GrailQA and WebQSP, respectively.
Problem

Research questions and friction points this paper is trying to address.

Optimizing reasoning paths in KGQA using historical data
Handling complex semantics via question decomposition
Improving accuracy with reward-guided tree search
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decomposes questions into simpler sub-questions
Uses reward-guided Self-Critic Monte Carlo Tree Search
Stacks weighted reasoning paths for final answers
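Two of the pieces named above are concrete enough to sketch: the reward-guided tree search (SC-MCTS relies on a selection rule balancing reward exploitation against exploration, for which standard UCT is the usual choice) and the weighted path stacking used to assemble the final answer context. The paper does not publish its code in this listing, so the following is a minimal illustrative sketch, not the authors' implementation; the names `uct_score` and `stack_weighted_paths` and the reward scale in [0, 1] are assumptions.

```python
import math

def uct_score(value_sum: float, visits: int, parent_visits: int,
              c: float = 1.41) -> float:
    """Standard UCT selection score: mean reward plus exploration bonus.
    Unvisited children get infinite priority so each is tried at least once.
    (Hypothetical stand-in for the reward-model-guided selection in SC-MCTS.)"""
    if visits == 0:
        return float("inf")
    return value_sum / visits + c * math.sqrt(math.log(parent_visits) / visits)

def stack_weighted_paths(scored_paths, top_k: int = 3) -> str:
    """Keep the top_k reasoning paths by reward weight and order them
    ascending, so the highest-weighted path sits closest to the answer
    prompt. scored_paths is a list of (path_string, weight) pairs."""
    best = sorted(scored_paths, key=lambda p: p[1])[-top_k:]
    return "\n".join(f"[w={w:.2f}] {path}" for path, w in best)
```

In a full pipeline, an LLM would produce the sub-question decomposition and the reward model would score each candidate path; here those scores are simply taken as given `(path, weight)` pairs.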
Xiao Long
University of Science and Technology of China | Alibaba Group
Knowledge Graph · Large Language Model · Reasoning
Liansheng Zhuang
University of Science and Technology of China
Computer Vision · Knowledge Graph · Computer Games
Chen Shen
Independent Researcher
Shaotian Yan
Alibaba Group
Machine Learning · Computer Vision · Large Language Models
Yifei Li
School of Cyberspace Science and Technology, University of Science and Technology of China (USTC), Hefei, Anhui 230026, China
Shafei Wang
Peng Cheng Laboratory, Shenzhen 518066, China