Explore-on-Graph: Incentivizing Autonomous Exploration of Large Language Models on Knowledge Graphs with Path-refined Reward Modeling

📅 2026-02-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the unreliability of large language models (LLMs) in question answering due to hallucinations and factual omissions, particularly when leveraging knowledge graphs (KGs). Existing KG-augmented approaches struggle to generalize to out-of-distribution graph structures. To overcome this limitation, the authors propose Explore-on-Graph, a novel framework that, for the first time, integrates path-refinement-based reward modeling with reinforcement learning to encourage LLMs to autonomously explore diverse and semantically meaningful reasoning paths on KGs—without relying on pre-defined paths or fine-tuning data. The method substantially enhances the model’s generalization capability on unseen graph structures and achieves state-of-the-art performance across five KG question-answering benchmarks, outperforming both open-source and closed-source LLMs.

📝 Abstract
The reasoning process of Large Language Models (LLMs) is often plagued by hallucinations and missing facts in question-answering tasks. A promising solution is to ground LLMs' answers in verifiable knowledge sources, such as Knowledge Graphs (KGs). Prevailing KG-enhanced methods typically constrain LLM reasoning either by enforcing rules during generation or by imitating paths from a fixed set of demonstrations. However, this confines the reasoning patterns of LLMs within the scope of prior experience or fine-tuning data, limiting their generalizability to out-of-distribution graph reasoning problems. To tackle this problem, we propose Explore-on-Graph (EoG), a novel framework that encourages LLMs to autonomously explore a more diverse reasoning space on KGs. To incentivize the exploration and discovery of novel reasoning paths, we introduce reinforcement learning during training, with the correctness of the reasoning paths' final answers as the reward. To make exploration more efficient and meaningful, we incorporate path information as an additional reward signal that refines the exploration process and reduces futile effort. Extensive experiments on five KGQA benchmark datasets demonstrate that our method achieves, to the best of our knowledge, state-of-the-art performance, outperforming not only open-source but also closed-source LLMs.
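The abstract describes a reward that combines outcome correctness with a path-based signal. A minimal sketch of what such a path-refined reward could look like is given below; the function name, the edge-validity scoring, and the weighting parameter `alpha` are illustrative assumptions, not details taken from the paper.

```python
def path_refined_reward(predicted_answer, gold_answers, path, kg_edges, alpha=0.5):
    """Hypothetical sketch: combine final-answer correctness with a
    path-quality bonus, as a stand-in for EoG's path-refined reward.

    path is a list of (head, relation, tail) hops; kg_edges is the set of
    triples actually present in the KG. All names here are illustrative.
    """
    # Outcome reward: 1 if the final answer matches any gold answer.
    correct = 1.0 if predicted_answer in gold_answers else 0.0
    # Path reward: fraction of explored hops that are real KG edges,
    # discouraging futile or hallucinated exploration steps.
    if path:
        valid_hops = sum(1 for hop in path if hop in kg_edges)
        path_score = valid_hops / len(path)
    else:
        path_score = 0.0
    return correct + alpha * path_score
```

Under this toy formulation, a correct answer reached via valid KG edges scores higher than a correct answer reached through hallucinated hops, which is one plausible way a path signal could refine an otherwise sparse, answer-only reward.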
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Knowledge Graphs
Hallucinations
Out-of-distribution Reasoning
KGQA
Innovation

Methods, ideas, or system contributions that make the work stand out.

autonomous exploration
reinforcement learning
knowledge graph
path-refined reward
large language models
Shiqi Yan
Zhongguancun Laboratory, Beijing, China; Department of Electronic Engineering, Tsinghua University, Beijing, China
Yubo Chen
Institute of Automation, Chinese Academy of Sciences
Natural Language Processing; Information Extraction; Event Extraction; Large Language Model
Ruiqi Zhou
Zhongguancun Laboratory, Beijing, China; Department of Electronic Engineering, Tsinghua University, Beijing, China
Zhengxi Yao
Zhongguancun Laboratory, Beijing, China; Department of Electronic Engineering, Tsinghua University, Beijing, China
Shuai Chen
Ant International, Ant Group, Hangzhou, Zhejiang, China
Tianyi Zhang
PhD Candidate, Zhejiang University & Shanghai AI Laboratory
Computer Vision; Deep Learning; Embodied Intelligence
Shijie Zhang
Ant International, Ant Group, Hangzhou, Zhejiang, China
Wei-Qiang Zhang
Zhongguancun Laboratory, Beijing, China; Department of Electronic Engineering, Tsinghua University, Beijing, China
Yongfeng Huang
PhD Student, Chinese University of Hong Kong
Natural Language Processing
Haixin Duan
Zhongguancun Laboratory, Beijing, China; Institute for Network Sciences and Cyberspace, Tsinghua University, Beijing, China
Yunqi Zhang
Zhongguancun Laboratory, Beijing, China