GRAG: Graph Retrieval-Augmented Generation

📅 2024-05-26

🏛️ North American Chapter of the Association for Computational Linguistics

📈 Citations: 31

✨ Influential: 5

career value

170K/year

🤖 AI Summary

To address the limitation of conventional RAG methods—which retrieve isolated text snippets while ignoring the inherent topological structure of networked documents (e.g., citation graphs, knowledge graphs)—this paper proposes a graph-aware RAG framework. Methodologically: (1) we design a divide-and-conquer linear-time text subgraph retrieval algorithm enabling efficient subgraph-level retrieval; (2) we introduce a dual-path encoder jointly processing text and graph views to explicitly model structural relationships; and (3) we incorporate topology-aware prompt injection and multi-hop graph reasoning fine-tuning. Evaluated on multiple graph reasoning benchmarks, our approach significantly outperforms existing RAG methods, achieving a 21.4% absolute accuracy gain on complex 3+-hop reasoning tasks. To our knowledge, this is the first work to synergistically enhance generative outputs through joint optimization of textual semantics and graph topology.

Technology Category

Application Category

📝 Abstract

Naive Retrieval-Augmented Generation (RAG) focuses on individual documents during retrieval and, as a result, falls short in handling networked documents which are very popular in many applications such as citation graphs, social media, and knowledge graphs. To overcome this limitation, we introduce Graph Retrieval-Augmented Generation (GRAG), which tackles the fundamental challenges in retrieving textual subgraphs and integrating the joint textual and topological information into Large Language Models (LLMs) to enhance its generation. To enable efficient textual subgraph retrieval, we propose a novel divide-and-conquer strategy that retrieves the optimal subgraph structure in linear time. To achieve graph context-aware generation, incorporate textual graphs into LLMs through two complementary views-the text view and the graph view-enabling LLMs to more effectively comprehend and utilize the graph context. Extensive experiments on graph reasoning benchmarks demonstrate that in scenarios requiring multi-hop reasoning on textual graphs, our GRAG approach significantly outperforms current state-of-the-art RAG methods.

Problem

Research questions and friction points this paper is trying to address.

Handling networked documents in retrieval-augmented generation

Retrieving optimal textual subgraphs efficiently

Integrating textual and topological information into LLMs

Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph Retrieval-Augmented Generation (GRAG) for networked documents

Divide-and-conquer strategy for linear-time subgraph retrieval

Dual-view integration (text and graph) into LLMs

🔎 Similar Papers

No similar papers found.