GRAG: Graph Retrieval-Augmented Generation

📅 2024-05-26
🏛️ North American Chapter of the Association for Computational Linguistics
📈 Citations: 31
✨ Influential: 5
🤖 AI Summary
To address the limitation of conventional RAG methods, which retrieve isolated text snippets while ignoring the inherent topological structure of networked documents (e.g., citation graphs, knowledge graphs), this paper proposes a graph-aware RAG framework. Methodologically: (1) we design a divide-and-conquer, linear-time algorithm for retrieving textual subgraphs, enabling efficient subgraph-level retrieval; (2) we introduce a dual-path encoder that jointly processes text and graph views to explicitly model structural relationships; and (3) we incorporate topology-aware prompt injection and fine-tuning for multi-hop graph reasoning. Evaluated on multiple graph reasoning benchmarks, our approach significantly outperforms existing RAG methods, achieving a 21.4% absolute accuracy gain on complex reasoning tasks of three or more hops. To our knowledge, this is the first work to synergistically enhance generative outputs through joint optimization of textual semantics and graph topology.

๐Ÿ“ Abstract
Naive Retrieval-Augmented Generation (RAG) focuses on individual documents during retrieval and, as a result, falls short in handling networked documents, which are very common in many applications such as citation graphs, social media, and knowledge graphs. To overcome this limitation, we introduce Graph Retrieval-Augmented Generation (GRAG), which tackles the fundamental challenges of retrieving textual subgraphs and integrating the joint textual and topological information into Large Language Models (LLMs) to enhance their generation. To enable efficient textual subgraph retrieval, we propose a novel divide-and-conquer strategy that retrieves the optimal subgraph structure in linear time. To achieve graph context-aware generation, we incorporate textual graphs into LLMs through two complementary views, the text view and the graph view, enabling LLMs to more effectively comprehend and utilize the graph context. Extensive experiments on graph reasoning benchmarks demonstrate that, in scenarios requiring multi-hop reasoning on textual graphs, our GRAG approach significantly outperforms current state-of-the-art RAG methods.
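The subgraph-level retrieval idea from the abstract can be illustrated with a toy sketch. This is not the paper's algorithm: simple keyword overlap stands in for learned text embeddings, and fixed k-hop ego-graphs stand in for the retrieved subgraph structures; all function names and the scoring rule here are illustrative assumptions.

```python
from collections import deque

def ego_nodes(adj, center, radius):
    """BFS out to `radius` hops from `center`; returns the ego-graph's node set."""
    seen = {center}
    frontier = deque([(center, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == radius:
            continue
        for nb in adj.get(node, ()):
            if nb not in seen:
                seen.add(nb)
                frontier.append((nb, dist + 1))
    return seen

def retrieve_subgraphs(adj, texts, query_terms, k_hop=1, top_n=2):
    """Score each node's k-hop ego-graph independently (one pass over nodes),
    using query-term overlap with the pooled node texts as a stand-in for
    embedding similarity; return the top_n highest-scoring subgraphs."""
    scored = []
    for center in adj:
        nodes = ego_nodes(adj, center, k_hop)
        words = set()
        for n in nodes:
            words.update(texts.get(n, "").lower().split())
        score = len(words & query_terms)
        scored.append((score, center, sorted(nodes)))
    scored.sort(key=lambda t: (-t[0], t[1]))
    return scored[:top_n]

# Tiny citation-style graph: a - b - c, each node carrying a text snippet.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
texts = {"a": "graph retrieval", "b": "language model", "c": "citation network"}
top = retrieve_subgraphs(adj, texts, {"graph", "retrieval"}, k_hop=1, top_n=1)
```

Because each candidate subgraph is scored independently, the work grows linearly with the number of nodes for a fixed hop radius, which is the flavor of efficiency the divide-and-conquer strategy is after.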
Problem

Research questions and friction points this paper is trying to address.

Handling networked documents in retrieval-augmented generation
Retrieving optimal textual subgraphs efficiently
Integrating textual and topological information into LLMs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph Retrieval-Augmented Generation (GRAG) for networked documents
Divide-and-conquer strategy for linear-time subgraph retrieval
Dual-view integration (text and graph) into LLMs
Yuntong Hu
Emory University
Graph Deep Learning, Generative AI, Data Mining
Zhihan Lei
Department of Computer Science, Emory University
Zhengwu Zhang
University of North Carolina at Chapel Hill
Computational Statistics, Machine Learning, Bayes, Image Analysis, Shape Analysis
Bo Pan
Department of Computer Science, Emory University
Chen Ling
Department of Computer Science, Emory University
Liang Zhao
Department of Computer Science, Emory University