RGL: A Graph-Centric, Modular Framework for Efficient Retrieval-Augmented Generation on Graphs

📅 2025-03-25
🤖 AI Summary
Existing graph-structured RAG systems suffer from architectural rigidity, high engineering overhead, inefficient subgraph retrieval, and poor scalability, and they fail to leverage the mature theory of subgraph pattern matching from graph database research. This paper proposes the first graph-centric, modular framework that unifies the RAG pipeline on graphs. It introduces a graph index and a dynamic subgraph retrieval mechanism that enable node-level adaptive filtering and on-the-fly subgraph construction; it integrates graph-aware tokenization, adapters for multiple graph data formats, and a lightweight generation interface; and it reduces token consumption through subgraph extraction and compression. Experiments report speedups of up to 143× on canonical graph reasoning and question-answering tasks, along with improvements in generation accuracy and response latency and shorter development cycles for RAG prototypes.

📝 Abstract
Recent advances in graph learning have paved the way for innovative retrieval-augmented generation (RAG) systems that leverage the inherent relational structures in graph data. However, many existing approaches suffer from rigid, fixed settings and significant engineering overhead, limiting their adaptability and scalability. Additionally, the RAG community has largely overlooked the decades of research in the graph database community regarding the efficient retrieval of interesting substructures on large-scale graphs. In this work, we introduce the RAG-on-Graphs Library (RGL), a modular framework that seamlessly integrates the complete RAG pipeline (from efficient graph indexing and dynamic node retrieval to subgraph construction, tokenization, and final generation) into a unified system. RGL addresses key challenges by supporting a variety of graph formats and integrating optimized implementations for essential components, achieving speedups of up to 143x compared to conventional methods. Moreover, its flexible utilities, such as dynamic node filtering, allow for rapid extraction of pertinent subgraphs while reducing token consumption. Our extensive evaluations demonstrate that RGL not only accelerates the prototyping process but also enhances the performance and applicability of graph-based RAG systems across a range of tasks.

Problem

Research questions and friction points this paper is trying to address.

Existing RAG systems lack adaptability and scalability
Graph database research is underutilized in RAG systems
Need for efficient graph indexing and dynamic retrieval

Innovation

Methods, ideas, or system contributions that make the work stand out.

Modular framework for graph-based RAG systems
Efficient graph indexing and dynamic node retrieval
Flexible utilities for rapid subgraph extraction
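
The pipeline stages named in the abstract (indexing, node retrieval, subgraph construction with dynamic node filtering, tokenization) can be illustrated with a minimal sketch. This is not RGL's API: every function name, the toy graph, and the bag-of-words similarity below are hypothetical stand-ins chosen so the example runs on the Python standard library alone. A real system would presumably use learned embeddings and an optimized index.

```python
import math
from collections import deque

# Hypothetical toy graph: node id -> text attribute, plus undirected edges.
NODE_TEXT = {
    0: "graph neural networks for node classification",
    1: "retrieval augmented generation with language models",
    2: "subgraph matching in graph databases",
    3: "image segmentation with convolutional networks",
    4: "efficient indexing for large scale graph retrieval",
}
EDGES = [(0, 1), (1, 2), (2, 4), (0, 3)]


def build_adjacency(edges):
    """Build an undirected adjacency map from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bag_of_words(text):
    """Word-count vector; a stand-in for a learned embedding."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def build_index(node_text):
    """Indexing stage: precompute one vector per node."""
    return {n: bag_of_words(t) for n, t in node_text.items()}


def retrieve_seeds(index, query, k=1):
    """Node retrieval stage: top-k nodes by similarity (ties broken by id)."""
    q = bag_of_words(query)
    ranked = sorted(index, key=lambda n: (-cosine(q, index[n]), n))
    return ranked[:k]


def expand_subgraph(adj, seeds, hops=1, keep=lambda n: True):
    """Subgraph construction: BFS up to `hops`, skipping nodes rejected by `keep`."""
    visited = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue
        for nb in sorted(adj.get(node, ())):
            if nb not in visited and keep(nb):
                visited.add(nb)
                frontier.append((nb, depth + 1))
    return visited


def tokenize_subgraph(nodes, node_text, edges):
    """Tokenization stage: serialize the subgraph as plain text for a prompt."""
    lines = [f"node {n}: {node_text[n]}" for n in sorted(nodes)]
    lines += [f"edge {u} - {v}" for u, v in edges if u in nodes and v in nodes]
    return "\n".join(lines)


if __name__ == "__main__":
    adj = build_adjacency(EDGES)
    index = build_index(NODE_TEXT)
    seeds = retrieve_seeds(index, "subgraph matching graph databases", k=1)
    # The `keep` predicate plays the role of dynamic node filtering.
    nodes = expand_subgraph(adj, seeds, hops=1, keep=lambda n: n != 1)
    print(tokenize_subgraph(nodes, NODE_TEXT, EDGES))
```

Passing a stricter `keep` predicate or a smaller `hops` value shrinks the serialized subgraph, which is the intuition behind reducing token consumption through filtering; the final generation step would pass the resulting text to a language model and is omitted here.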
Authors

Yuan Li, National University of Singapore, Singapore
Jun Hu, National University of Singapore, Singapore
Jiaxin Jiang, National University of Singapore, Singapore
Zemin Liu, Zhejiang University. Interests: Graph Learning, Graph Imbalanced Learning
Bryan Hooi, National University of Singapore. Interests: Machine Learning, Natural Language Processing, Graphs, Trustworthy AI
Bingsheng He, National University of Singapore, Singapore