LightRAG: Simple and Fast Retrieval-Augmented Generation

📅 2024-10-08

🏛️ arXiv.org

📈 Citations: 11

✨ Influential: 2

🤖 AI Summary

Existing RAG systems suffer from answer fragmentation and poor modeling of complex inter-entity dependencies because they rely on flat data representations and have insufficient context awareness. To address these limitations, the authors propose LightRAG, a lightweight RAG framework that incorporates graph structures into text indexing and retrieval. Its core contributions are: (1) a dual-level retrieval mechanism that covers both low-level and high-level knowledge discovery; (2) the integration of graph structures with vector representations for efficient retrieval of related entities and their relationships; and (3) an incremental update algorithm that integrates new data in a timely manner. Experiments show considerable improvements in retrieval accuracy and efficiency over existing approaches. The source code is publicly available at https://github.com/HKUDS/LightRAG.

📝 Abstract

Retrieval-Augmented Generation (RAG) systems enhance large language models (LLMs) by integrating external knowledge sources, enabling more accurate and contextually relevant responses tailored to user needs. However, existing RAG systems have significant limitations, including reliance on flat data representations and inadequate contextual awareness, which can lead to fragmented answers that fail to capture complex inter-dependencies. To address these challenges, we propose LightRAG, which incorporates graph structures into text indexing and retrieval processes. This innovative framework employs a dual-level retrieval system that enhances comprehensive information retrieval from both low-level and high-level knowledge discovery. Additionally, the integration of graph structures with vector representations facilitates efficient retrieval of related entities and their relationships, significantly improving response times while maintaining contextual relevance. This capability is further enhanced by an incremental update algorithm that ensures the timely integration of new data, allowing the system to remain effective and responsive in rapidly changing data environments. Extensive experimental validation demonstrates considerable improvements in retrieval accuracy and efficiency compared to existing approaches. We have made our LightRAG open-source and available at the link: https://github.com/HKUDS/LightRAG

Problem

Research questions and friction points this paper is trying to address.

Existing RAG systems rely on flat data representations and lack contextual awareness, which leads to fragmented answers

Retrieval fails to capture complex inter-dependencies across low-level and high-level knowledge

New data is hard to integrate promptly in rapidly changing data environments

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses graph structures for text indexing

Implements dual-level retrieval system

Integrates incremental update algorithm
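
The three ideas above can be illustrated with a toy sketch. This is not the LightRAG API: the class GraphIndex and its methods merge, low_level, and high_level are hypothetical names invented for illustration. It assumes that low-level retrieval means looking up edges around a specific entity, that high-level retrieval means matching broader theme keywords attached to relations, and that an incremental update is a union of new nodes and edges into the existing graph. The vector-representation side of the framework is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class GraphIndex:
    """Toy knowledge-graph index (hypothetical, not the LightRAG API)."""

    # entity name -> short description
    entities: dict[str, str] = field(default_factory=dict)
    # (source, target) -> theme keywords describing the relation
    relations: dict[tuple[str, str], set[str]] = field(default_factory=dict)

    def merge(self, other: "GraphIndex") -> None:
        """Assumed incremental update: union new nodes and edges in place."""
        for name, desc in other.entities.items():
            self.entities.setdefault(name, desc)
        for edge, themes in other.relations.items():
            self.relations.setdefault(edge, set()).update(themes)

    def low_level(self, entity: str) -> list[tuple[str, str]]:
        """Entity-centric lookup: edges that touch one specific entity."""
        return sorted(edge for edge in self.relations if entity in edge)

    def high_level(self, theme: str) -> list[tuple[str, str]]:
        """Theme-centric lookup: edges tagged with a broader keyword."""
        return sorted(e for e, themes in self.relations.items() if theme in themes)


# Build a small index, then fold in new data without rebuilding it.
index = GraphIndex(
    entities={"bee": "pollinating insect", "flower": "plant reproductive organ"},
    relations={("bee", "flower"): {"pollination"}},
)
update = GraphIndex(
    entities={"beekeeper": "person who manages hives"},
    relations={("beekeeper", "bee"): {"agriculture"}},
)
index.merge(update)

print(index.low_level("bee"))           # edges around the entity "bee"
print(index.high_level("agriculture"))  # edges tagged with the theme
```

Because the merge only adds to the existing dictionaries, previously indexed entities and relations stay untouched, which is the property an incremental update is meant to preserve in this sketch.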

🔎 Similar Papers

FlashRAG: A Modular Toolkit for Efficient Retrieval-Augmented Generation Research