MiniRAG: Towards Extremely Simple Retrieval-Augmented Generation

📅 2025-01-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the weak semantic understanding and severe performance degradation of small language models (SLMs) in resource-constrained retrieval-augmented generation (RAG), this paper proposes the first minimalist, highly efficient RAG framework designed specifically for SLMs. The method introduces: (1) a semantics-aware heterogeneous graph index that jointly models named entities and text chunks; (2) a topology-driven lightweight retrieval mechanism that removes the reliance on complex semantic modeling; and (3) an on-device query optimization strategy. Experiments show that the approach matches the effectiveness of LLM-based RAG while requiring only 25% of the storage. The authors also contribute the first lightweight RAG benchmark tailored to realistic on-device complex queries, enabling gains in both effectiveness and efficiency.

📝 Abstract
The growing demand for efficient and lightweight Retrieval-Augmented Generation (RAG) systems has highlighted significant challenges when deploying Small Language Models (SLMs) in existing RAG frameworks. Current approaches face severe performance degradation due to SLMs' limited semantic understanding and text processing capabilities, creating barriers for widespread adoption in resource-constrained scenarios. To address these fundamental limitations, we present MiniRAG, a novel RAG system designed for extreme simplicity and efficiency. MiniRAG introduces two key technical innovations: (1) a semantic-aware heterogeneous graph indexing mechanism that combines text chunks and named entities in a unified structure, reducing reliance on complex semantic understanding, and (2) a lightweight topology-enhanced retrieval approach that leverages graph structures for efficient knowledge discovery without requiring advanced language capabilities. Our extensive experiments demonstrate that MiniRAG achieves comparable performance to LLM-based methods even when using SLMs while requiring only 25% of the storage space. Additionally, we contribute a comprehensive benchmark dataset for evaluating lightweight RAG systems under realistic on-device scenarios with complex queries. We fully open-source our implementation and datasets at: https://github.com/HKUDS/MiniRAG.
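The two innovations described above can be illustrated with a toy sketch: a single index holding both "entity" and "chunk" nodes, and a retrieval step that walks entity co-occurrence edges instead of relying on semantic matching. This is a hedged illustration of the general idea, not the authors' implementation; all names (`HeteroGraphIndex`, `add_chunk`, `retrieve`) are hypothetical, and entity extraction and ranking are omitted.

```python
from collections import defaultdict

class HeteroGraphIndex:
    """Toy heterogeneous index mixing two node types: entities and text chunks.

    Illustrative only -- MiniRAG's actual index and retrieval differ.
    """

    def __init__(self):
        self.chunks = {}                       # chunk_id -> chunk text
        self.entity_chunks = defaultdict(set)  # entity -> ids of chunks mentioning it
        self.entity_edges = defaultdict(set)   # entity -> entities co-occurring with it

    def add_chunk(self, chunk_id, text, entities):
        """Index a text chunk with its (pre-extracted) named entities."""
        self.chunks[chunk_id] = text
        for e in entities:
            self.entity_chunks[e].add(chunk_id)
        # Link entities that appear in the same chunk.
        for a in entities:
            for b in entities:
                if a != b:
                    self.entity_edges[a].add(b)

    def retrieve(self, query_entities, hops=1):
        """Topology-driven retrieval: expand seed entities along
        co-occurrence edges, then collect all chunks linked to them."""
        frontier = set(query_entities)
        seen = set(frontier)
        for _ in range(hops):
            nxt = set()
            for e in frontier:
                nxt |= self.entity_edges[e] - seen
            seen |= nxt
            frontier = nxt
        hits = set()
        for e in seen:
            hits |= self.entity_chunks[e]
        return [self.chunks[c] for c in sorted(hits)]

# Usage: one hop from "Alice" reaches "Acme", pulling in both chunks.
idx = HeteroGraphIndex()
idx.add_chunk("c1", "Alice founded Acme.", ["Alice", "Acme"])
idx.add_chunk("c2", "Acme acquired Beta.", ["Acme", "Beta"])
print(idx.retrieve(["Alice"], hops=1))
```

The point of the sketch is that knowledge discovery here requires only set unions over graph adjacency, not embedding similarity or deep semantic matching, which is why such a scheme stays cheap enough for SLMs.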
Problem

Research questions and friction points this paper is trying to address.

Resource-limited Conditions
Small Model Retrieval
Semantic Understanding Limitations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Smart Indexing
Graph-based Retrieval
Miniaturized Model