GRAIL: Learning to Interact with Large Knowledge Graphs for Retrieval Augmented Reasoning

📅 2025-08-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing RAG methods exhibit weak retrieval capability over structured knowledge—particularly knowledge graphs (KGs)—due to insufficient joint modeling of graph topology and precision control, often resulting in information loss or redundancy that degrades reasoning performance. To address this, we propose GRAIL, the first framework to deeply integrate large language models (LLMs) into dynamic KG exploration and precise retrieval. GRAIL employs LLM-guided stochastic path exploration to generate candidate reasoning paths, decouples accuracy and conciseness objectives via path filtering and a two-stage policy learning mechanism, and further refines retrieval decisions through process-supervised reinforcement learning. This enables fine-grained, interpretable reasoning trajectory construction. Evaluated on three KGQA benchmarks, GRAIL achieves average improvements of +21.01% in accuracy and +22.43% in F1-score over state-of-the-art RAG and graph-based retrieval methods.

📝 Abstract
Large Language Models (LLMs) integrated with Retrieval-Augmented Generation (RAG) techniques have exhibited remarkable performance across a wide range of domains. However, existing RAG approaches primarily operate on unstructured data and demonstrate limited capability in handling structured knowledge such as knowledge graphs. Meanwhile, current graph retrieval methods fundamentally struggle to capture holistic graph structures while simultaneously facing precision control challenges that manifest as either critical information gaps or excessive redundant connections, collectively undermining reasoning performance. To address this challenge, we propose GRAIL: Graph-Retrieval Augmented Interactive Learning, a framework designed to interact with large-scale graphs for retrieval-augmented reasoning. Specifically, GRAIL integrates LLM-guided random exploration with path filtering to establish a data synthesis pipeline, where a fine-grained reasoning trajectory is automatically generated for each task. Based on the synthesized data, we then employ a two-stage training process to learn a policy that dynamically decides the optimal actions at each reasoning step. The overall objective of precision-conciseness balance in graph retrieval is decoupled into fine-grained process-supervised rewards to enhance data efficiency and training stability. In practical deployment, GRAIL adopts an interactive retrieval paradigm, enabling the model to autonomously explore graph paths while dynamically balancing retrieval breadth and precision. Extensive experiments show that GRAIL achieves an average accuracy improvement of 21.01% and an F1 improvement of 22.43% on three knowledge graph question-answering datasets. Our source code and datasets are available at https://github.com/Changgeww/GRAIL.
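The exploration-plus-filtering idea in the abstract can be illustrated with a small sketch. This is not the authors' pipeline: the toy graph, the function names `explore_paths` and `filter_paths`, and the filtering rule are all assumptions made for illustration, and uniform random sampling stands in for the LLM-guided choice of the next edge.

```python
import random

# Toy knowledge graph: head entity -> list of (relation, tail entity).
# Every entity and relation here is made up for illustration.
TOY_KG = {
    "Inception": [("directed_by", "Christopher Nolan"), ("released_in", "2010")],
    "Christopher Nolan": [("born_in", "London"), ("directed", "Interstellar")],
    "London": [("capital_of", "United Kingdom")],
    "Interstellar": [("released_in", "2014")],
}


def explore_paths(kg, start, max_hops, num_walks, rng):
    """Sample random walks from `start`, each at most `max_hops` edges long.

    Assumption: a uniform random choice replaces the LLM-guided edge
    selection described in the abstract.
    """
    paths = []
    for _ in range(num_walks):
        node, path = start, []
        for _ in range(max_hops):
            edges = kg.get(node, [])
            if not edges:
                break  # dead end: no outgoing edges
            relation, tail = rng.choice(edges)
            path.append((node, relation, tail))
            node = tail
        if path:
            paths.append(path)
    return paths


def filter_paths(paths, answers):
    """Keep only paths that reach an answer entity.

    Each kept path is cut at the first answer it reaches, duplicates are
    dropped, and the result is ordered shortest-first (ties broken by
    content so the order is deterministic).
    """
    kept = set()
    for path in paths:
        for i, (_, _, tail) in enumerate(path):
            if tail in answers:
                kept.add(tuple(path[: i + 1]))
                break
    return sorted(kept, key=lambda p: (len(p), p))
```

Calling `filter_paths(explore_paths(TOY_KG, "Inception", 3, 50, random.Random(0)), {"London"})` discards walks that wander to "2010" or "Interstellar". In this toy graph the only path that can survive is the two-hop chain from "Inception" through "Christopher Nolan" to "London", which is the kind of compact trajectory a synthesis pipeline would keep as training data.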
Problem

Research questions and friction points this paper is trying to address.

Handling structured knowledge in retrieval-augmented reasoning
Balancing precision and conciseness in graph retrieval
Improving reasoning performance with large knowledge graphs
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-guided random exploration with path filtering
Two-stage training for dynamic action policy
Interactive retrieval balancing breadth and precision
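The interactive retrieval idea listed above can be sketched as a step-by-step loop in which a policy picks which edges to keep and when to stop. This is a minimal sketch under stated assumptions, not the paper's implementation: `interactive_retrieve`, `keyword_policy`, and the toy graph are hypothetical, and a keyword-matching heuristic stands in for the learned LLM policy.

```python
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]
Graph = Dict[str, List[Tuple[str, str]]]
Policy = Callable[[str, List[Triple], List[Triple]], List[Triple]]

# Toy graph; all entities and relations are made up for illustration.
TOY_GRAPH: Graph = {
    "Inception": [("directed_by", "Christopher Nolan"), ("released_in", "2010")],
    "Christopher Nolan": [("born_in", "London"), ("won_award", "BAFTA")],
    "London": [("capital_of", "United Kingdom")],
}


def interactive_retrieve(kg: Graph, question: str, start: str,
                         policy: Policy, max_steps: int = 5) -> List[Triple]:
    """Grow a retrieved subgraph one step at a time.

    At each step the policy sees the question, what has been retrieved so
    far, and the candidate edges leaving the current frontier. Returning an
    empty selection is treated as the "stop" action.
    """
    retrieved: List[Triple] = []
    frontier = [start]
    for _ in range(max_steps):
        candidates = [
            (head, relation, tail)
            for head in frontier
            for relation, tail in kg.get(head, [])
            if (head, relation, tail) not in retrieved
        ]
        if not candidates:
            break  # nothing left to explore
        chosen = policy(question, retrieved, candidates)
        if not chosen:
            break  # the policy decided to stop
        retrieved.extend(chosen)
        frontier = [tail for _, _, tail in chosen]
    return retrieved


def keyword_policy(question: str, retrieved: List[Triple],
                   candidates: List[Triple]) -> List[Triple]:
    """Stand-in policy: keep edges whose relation shares a word with the question.

    Assumption: a trained model would make this decision in practice.
    """
    words = set(question.lower().replace("?", "").split())
    return [c for c in candidates if set(c[1].split("_")) & words]
```

For the question "Where was the person who directed Inception born?", starting from "Inception", this loop keeps `directed_by` in the first step, `born_in` in the second, and stops at "London" because `capital_of` matches nothing in the question. Retrieval breadth is set by how many candidates the policy keeps per step, and conciseness by when it returns an empty selection.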
Authors
Ge Chang
Institute for AI Industry Research (AIR), Tsinghua University
Jinbo Su
Institute for AI Industry Research (AIR), Tsinghua University
Jiacheng Liu
Institute for AI Industry Research (AIR), Tsinghua University
Pengfei Yang
Institute of Software, Chinese Academy of Sciences
Probabilistic model checking, DNN verification
Yuhao Shang
Institute for AI Industry Research (AIR), Tsinghua University
Huiwen Zheng
GDS Holdings Limited
Hongli Ma
GDS Holdings Limited
Yan Liang
Northwestern Polytechnical University
Information fusion, State Estimation, Target tracking
Yuanchun Li
Institute for AI Industry Research (AIR), Tsinghua University
mobile computing, artificial intelligence
Yunxin Liu
IEEE Fellow, Guoqiang Professor, Institute for AI Industry Research (AIR), Tsinghua University
Mobile Computing, Edge Computing, AIoT, System, Networking