RGL: A Graph-Centric, Modular Framework for Efficient Retrieval-Augmented Generation on Graphs

📅 2025-03-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing graph-structured RAG systems suffer from architectural rigidity, high engineering overhead, inefficient subgraph retrieval, and poor scalability, and they fail to leverage mature subgraph pattern matching theory from graph database research. This paper proposes RGL, the first graph-centric, modular framework that unifies the RAG pipeline. It introduces a graph index and a dynamic subgraph retrieval mechanism that enable node-level adaptive filtering and on-the-fly subgraph construction; integrates graph-aware tokenization, support for multiple graph data formats, and a lightweight generation interface; and reduces token consumption through subgraph extraction and compression. Experiments report speedups of up to 143× on canonical graph reasoning and question-answering tasks, along with improvements in generation accuracy and response latency and a much shorter RAG prototype development cycle.

📝 Abstract

Recent advances in graph learning have paved the way for innovative retrieval-augmented generation (RAG) systems that leverage the inherent relational structures in graph data. However, many existing approaches suffer from rigid, fixed settings and significant engineering overhead, limiting their adaptability and scalability. Additionally, the RAG community has largely overlooked the decades of research in the graph database community regarding the efficient retrieval of interesting substructures on large-scale graphs. In this work, we introduce the RAG-on-Graphs Library (RGL), a modular framework that seamlessly integrates the complete RAG pipeline, from efficient graph indexing and dynamic node retrieval to subgraph construction, tokenization, and final generation, into a unified system. RGL addresses key challenges by supporting a variety of graph formats and integrating optimized implementations for essential components, achieving speedups of up to 143x compared to conventional methods. Moreover, its flexible utilities, such as dynamic node filtering, allow for rapid extraction of pertinent subgraphs while reducing token consumption. Our extensive evaluations demonstrate that RGL not only accelerates the prototyping process but also enhances the performance and applicability of graph-based RAG systems across a range of tasks.
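To make the pipeline stages named in the abstract concrete, here is a minimal sketch of node retrieval, subgraph construction, and tokenization on a toy graph. It is not RGL's API: the function names (`retrieve_seeds`, `expand_subgraph`, `serialize`), the brute-force cosine ranking, the breadth-first expansion, and the toy data are all assumptions chosen for illustration, and the final generation step (passing the serialized text to a language model) is left out.

```python
import math
from collections import deque

# Toy graph (hypothetical data): adjacency lists, node embeddings, node text.
ADJ = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
NODE_VECS = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.1, 0.9]}
NODE_TEXT = {"a": "graph indexing", "b": "node retrieval",
             "c": "tokenization", "d": "generation"}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve_seeds(query_vec, node_vecs, k):
    """Stand-in for an index lookup: rank all nodes by similarity, keep top k."""
    ranked = sorted(node_vecs, key=lambda n: cosine(query_vec, node_vecs[n]),
                    reverse=True)
    return ranked[:k]


def expand_subgraph(adj, seeds, hops):
    """Collect every node within `hops` edges of any seed (breadth-first)."""
    visited = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue
        for neighbor in adj.get(node, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return visited


def serialize(adj, nodes, text):
    """Turn the induced subgraph into plain text suitable for a prompt."""
    lines = [f"[{n}] {text[n]}" for n in sorted(nodes)]
    edges = sorted({tuple(sorted((u, v)))
                    for u in nodes for v in adj.get(u, ()) if v in nodes})
    lines += [f"{u} -- {v}" for u, v in edges]
    return "\n".join(lines)


seeds = retrieve_seeds([1.0, 0.0], NODE_VECS, k=1)
subgraph = expand_subgraph(ADJ, seeds, hops=1)
prompt_context = serialize(ADJ, subgraph, NODE_TEXT)
```

A real system would replace the linear scan in `retrieve_seeds` with an actual vector index; the sketch only shows how the stages hand data to one another.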

Problem

Research questions and friction points this paper is trying to address.

Existing RAG systems lack adaptability and scalability

Graph database research is underutilized in RAG systems

Need for efficient graph indexing and dynamic retrieval

Innovation

Methods, ideas, or system contributions that make the work stand out.

Modular framework for graph-based RAG systems

Efficient graph indexing and dynamic node retrieval

Flexible utilities for rapid subgraph extraction
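As an illustration of how node filtering can reduce token consumption, the sketch below keeps the highest-scoring retrieved nodes that fit within a token budget. It is a hypothetical helper, not RGL's implementation: `filter_nodes`, the `min_score` threshold, the greedy selection order, and the use of whitespace-separated words as a stand-in for tokenizer counts are all assumptions.

```python
def filter_nodes(scores, texts, min_score, token_budget):
    """Greedily keep the most relevant nodes whose text fits the budget.

    scores: node -> relevance score (higher is more relevant)
    texts:  node -> text that would be placed in the prompt
    Token cost is approximated by the whitespace-separated word count.
    """
    kept, used = [], 0
    for node in sorted(scores, key=scores.get, reverse=True):
        if scores[node] < min_score:
            break  # every remaining node scores lower still
        cost = len(texts[node].split())
        if used + cost > token_budget:
            continue  # too large, but a smaller node may still fit
        kept.append(node)
        used += cost
    return kept, used


# Hypothetical retrieval scores and node texts.
scores = {"a": 0.9, "b": 0.8, "c": 0.3, "d": 0.7}
texts = {
    "a": "graph index",
    "b": "dynamic node retrieval over large graphs",
    "c": "unrelated",
    "d": "subgraph construction",
}
kept, used = filter_nodes(scores, texts, min_score=0.5, token_budget=5)
```

With these inputs, node `b` is skipped because its six words would exceed the budget, and node `c` is dropped for falling below the score threshold.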

🔎 Similar Papers

GRAG: Graph Retrieval-Augmented Generation