When to use Graphs in RAG: A Comprehensive Analysis for Graph Retrieval-Augmented Generation

📅 2025-06-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
Although GraphRAG was proposed to model conceptual hierarchies for enhancing retrieval-augmented generation (RAG), its empirical efficacy remains unclear. Method: We systematically investigate its applicability boundaries by introducing GraphRAG-Bench—the first comprehensive, task-diverse evaluation benchmark for GraphRAG, covering factual retrieval, complex reasoning, contextual summarization, and creative generation. Our framework comprises hierarchical graph construction (via entity-relation extraction and hierarchical clustering), graph-based retrieval (subgraph matching with path-aware re-ranking), and LLM-coordinated generation, supported by a multi-granularity, attribution-aware evaluation protocol. Results: Empirical analysis reveals that GraphRAG significantly outperforms standard RAG only under specific conditions—namely, when tasks involve deep conceptual hierarchies and long-chain reasoning—yielding up to +12.7% accuracy gain. This uncovers critical structural prerequisites for graph-based augmentation. Our contributions include the first reproducible, attribution-grounded GraphRAG benchmark and a practical deployment guideline for real-world adoption.
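The three-stage framework described above (hierarchical graph construction, graph-based retrieval, LLM-coordinated generation) can be sketched as a toy pipeline. This is an illustrative sketch only: the function names (`build_graph`, `retrieve_subgraph`, `generate`) and the simple breadth-first retrieval are assumptions for exposition, not the benchmark's actual API, and the generation step is a stand-in for an LLM call.

```python
from collections import defaultdict

def build_graph(triples):
    """Graph construction: adjacency map from (head, relation, tail) triples."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return graph

def retrieve_subgraph(graph, seed, max_hops=2):
    """Graph-based retrieval: collect triples reachable from a seed entity
    within max_hops (a crude proxy for subgraph matching)."""
    frontier, seen, paths = {seed}, {seed}, []
    for _ in range(max_hops):
        next_frontier = set()
        for node in frontier:
            for relation, tail in graph.get(node, []):
                paths.append((node, relation, tail))
                if tail not in seen:
                    seen.add(tail)
                    next_frontier.add(tail)
        frontier = next_frontier
    return paths

def generate(question, evidence):
    """Generation stand-in: linearize retrieved paths as LLM context."""
    context = "; ".join(f"{h} -{r}-> {t}" for h, r, t in evidence)
    return f"Q: {question} | evidence: {context}"

triples = [("RAG", "extended_by", "GraphRAG"),
           ("GraphRAG", "uses", "knowledge graph")]
graph = build_graph(triples)
evidence = retrieve_subgraph(graph, "RAG")
print(generate("What does GraphRAG use?", evidence))
```

The two-hop retrieval here is what lets the pipeline answer a question whose evidence is not attached to the seed entity directly, which mirrors the long-chain-reasoning setting where the paper finds GraphRAG pays off.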

📝 Abstract
Graph retrieval-augmented generation (GraphRAG) has emerged as a powerful paradigm for enhancing large language models (LLMs) with external knowledge. It leverages graphs to model the hierarchical structure between specific concepts, enabling more coherent and effective knowledge retrieval for accurate reasoning. Despite its conceptual promise, recent studies report that GraphRAG frequently underperforms vanilla RAG on many real-world tasks. This raises a critical question: Is GraphRAG really effective, and in which scenarios do graph structures provide measurable benefits for RAG systems? To address this, we propose GraphRAG-Bench, a comprehensive benchmark designed to evaluate GraphRAG models on both hierarchical knowledge retrieval and deep contextual reasoning. GraphRAG-Bench features a comprehensive dataset with tasks of increasing difficulty, covering fact retrieval, complex reasoning, contextual summarization, and creative generation, and a systematic evaluation across the entire pipeline, from graph construction and knowledge retrieval to final generation. Leveraging this novel benchmark, we systematically investigate the conditions when GraphRAG surpasses traditional RAG and the underlying reasons for its success, offering guidelines for its practical application. All related resources and analyses are collected for the community at https://github.com/GraphRAG-Bench/GraphRAG-Benchmark.
Problem

Research questions and friction points this paper is trying to address.

When does GraphRAG outperform traditional RAG methods?
Which scenarios benefit from graph structures in RAG?
How can GraphRAG's effectiveness in knowledge retrieval be evaluated?
Innovation

Methods, ideas, or system contributions that make the work stand out.

GraphRAG-Bench benchmark for evaluation
Hierarchical knowledge retrieval enhancement
Systematic pipeline from graph to generation
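The paper's deployment guideline (prefer GraphRAG only when tasks involve deep conceptual hierarchies or long-chain reasoning) can be expressed as a routing heuristic. The sketch below is a hedged illustration under assumed inputs: the feature names and the threshold values are illustrative, not numbers taken from the paper.

```python
def choose_retriever(hierarchy_depth: int, reasoning_hops: int) -> str:
    """Route a task to GraphRAG or vanilla RAG.

    hierarchy_depth: assumed depth of the task's conceptual hierarchy.
    reasoning_hops: assumed number of reasoning steps needed.
    Thresholds are illustrative placeholders, not values from the paper.
    """
    if hierarchy_depth >= 3 or reasoning_hops >= 2:
        return "graphrag"      # deep hierarchy / long-chain reasoning
    return "vanilla_rag"       # flat, single-hop factual lookup

# Simple factual lookup stays with vanilla RAG; multi-hop goes to GraphRAG.
print(choose_retriever(hierarchy_depth=1, reasoning_hops=1))  # vanilla_rag
print(choose_retriever(hierarchy_depth=4, reasoning_hops=3))  # graphrag
```

In practice such a router would sit in front of both retrieval backends and estimate the two features from the query itself; the point of the sketch is only the shape of the decision, matching the paper's finding that graph augmentation pays off under specific structural conditions.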
🔎 Similar Papers
2024-05-26 · North American Chapter of the Association for Computational Linguistics · Citations: 31