GRIL: Knowledge Graph Retrieval-Integrated Learning with Large Language Models

📅 2025-09-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing graph-structured RAG approaches suffer from three key limitations: decoupled retrieval and reasoning, poor scalability to multi-hop queries, and heavy reliance on annotated entities. This paper proposes a joint learning framework that unifies knowledge graph (KG) retrieval with large language models (LLMs) via end-to-end training, enabling adaptive multi-hop retrieval and reasoning co-optimization. Our core contributions are: (1) an attention-driven growth-and-pruning mechanism guided by LLM logits—providing implicit feedback without requiring ground-truth entity annotations—enabling open-domain, dynamic subgraph construction; and (2) a soft-token encoding scheme that structurally embeds graph information into the LLM, seamlessly integrating multi-hop retrieval, subgraph refinement, and joint backpropagation. Evaluated on three QA benchmarks, our method achieves state-of-the-art performance, with significant gains in complex multi-hop reasoning accuracy, demonstrating both the effectiveness and generalizability of the joint optimization paradigm.
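The growth-and-pruning retrieval described above can be sketched as a simple loop over a toy graph. This is a hedged illustration, not the paper's implementation: the `score` function below is a hypothetical stand-in for the attention/LLM-logit relevance signal, and the graph is a toy adjacency dict.

```python
# Illustrative sketch of attention-driven grow-and-prune subgraph retrieval.
# `score` stands in for the LLM-derived relevance signal (hypothetical).

def grow_and_prune(graph, seeds, score, hops=2, keep=3):
    """Expand a frontier hop by hop, keeping only the top-`keep`
    neighbors per hop according to an attention-like score."""
    subgraph = set(seeds)
    frontier = set(seeds)
    for _ in range(hops):
        # Grow: collect all unvisited neighbors of the current frontier.
        candidates = {v for u in frontier
                      for v in graph.get(u, []) if v not in subgraph}
        if not candidates:
            break
        # Prune: retain only the highest-scoring candidates (noise filtering).
        ranked = sorted(candidates, key=score, reverse=True)
        frontier = set(ranked[:keep])
        subgraph |= frontier
    return subgraph

# Toy usage: expand two hops from a seed entity, keeping 2 nodes per hop.
graph = {"q": ["a", "b", "c", "d"], "a": ["e"], "b": ["f"]}
scores = {"a": 0.9, "b": 0.8, "c": 0.1, "d": 0.2, "e": 0.7, "f": 0.3}
sub = grow_and_prune(graph, {"q"}, lambda v: scores.get(v, 0.0),
                     hops=2, keep=2)
```

In the paper's framework the pruning signal comes from the LLM's logits rather than a fixed table, which is what lets the retriever train end-to-end without ground-truth entity annotations.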

📝 Abstract
Retrieval-Augmented Generation (RAG) has significantly mitigated the hallucinations of Large Language Models (LLMs) by grounding generation in external knowledge. Recent extensions of RAG to graph-based retrieval offer a promising direction, leveraging structural knowledge for multi-hop reasoning. However, existing graph RAG methods typically decouple the retrieval and reasoning processes, which prevents the retriever from adapting to the reasoning needs of the LLM. They also struggle with scalability when performing multi-hop expansion over large-scale graphs, or depend heavily on annotated ground-truth entities, which are often unavailable in open-domain settings. To address these challenges, we propose a novel graph retriever trained end-to-end with the LLM, featuring an attention-based growing and pruning mechanism that adaptively navigates to multi-hop relevant entities while filtering out noise. Within the extracted subgraph, structural knowledge and semantic features are encoded via soft tokens and the verbalized graph, respectively, and are infused into the LLM together, thereby enhancing its reasoning capability and facilitating interactive joint training of the graph retriever and the LLM reasoner. Experimental results across three QA benchmarks show that our approach consistently achieves state-of-the-art performance, validating the strength of joint graph-LLM optimization for complex reasoning tasks. Notably, our framework eliminates the need for predefined ground-truth entities by directly optimizing the retriever with LLM logits as implicit feedback, making it especially effective in open-domain settings.
Problem

Research questions and friction points this paper is trying to address.

Existing graph RAG decouples retrieval from LLM reasoning needs
Graph RAG struggles with scalability on large multi-hop graphs
Current approaches depend heavily on annotated ground-truth entities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph retriever trained end-to-end with the LLM, using its logits as feedback
Attention-based growing and pruning mechanism for entities
Soft tokens and verbalized graphs enhance reasoning capability
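The soft-token idea in the last bullet can be illustrated with a minimal sketch: embeddings of nodes in the retrieved subgraph are linearly projected into the LLM's embedding space and prepended to the text-token embeddings, so the model attends over graph structure alongside the verbalized text. The projection, dimensions, and function names below are hypothetical, not the paper's actual architecture.

```python
# Minimal sketch of soft-token infusion (illustrative, not the paper's code).
# Graph-side node embeddings are mapped into the LLM embedding space by a
# learned linear projection, then prepended to the text-token embeddings.

def project(node_emb, weight):
    """Linear map from graph space (len(node_emb)) to LLM space (len(weight)).
    `weight` is a list of rows, one per output dimension."""
    return [sum(w * x for w, x in zip(row, node_emb)) for row in weight]

def infuse(node_embs, text_embs, weight):
    """Prepend one soft token per subgraph node to the text embeddings."""
    soft_tokens = [project(e, weight) for e in node_embs]
    return soft_tokens + text_embs

# Toy usage: two 2-d node embeddings projected into a 3-d "LLM" space,
# prepended to a single 3-d text-token embedding.
W = [[1, 0], [0, 1], [1, 1]]
sequence = infuse([[1, 0], [0, 1]], [[0, 0, 0]], W)
```

Because the projection is differentiable, gradients from the LLM's loss can flow back through the soft tokens into the retriever, which is what enables the joint backpropagation the summary describes.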
Jialin Chen
Yale University
Foundation Models · Graph Learning · Multimodal RAG
Houyu Zhang
Amazon
Seongjun Yun
Amazon
Alejandro Mottini
Amazon
Rex Ying
Yale University
Xiang Song
Amazon
Vassilis N. Ioannidis
Amazon
Zheng Li
Amazon
Qingjun Cui
Amazon