🤖 AI Summary
Graph-based RAG systems face two key bottlenecks: heavy reliance on costly entity linking and relation extraction, and retrieval redundancy and subgraph sparsity caused by the mismatch between semantic similarity and semantic relevance. This paper proposes SlimRAG, a lightweight, graph-free RAG framework that eliminates explicit graph construction and traversal. It introduces an entity-aware context selection mechanism: an entity-to-chunk embedding mapping table is built offline, and at query time entities are dynamically identified and used to re-rank relevant text chunks. The paper further proposes Relative Index Token Utilization (RITU), a metric quantifying retrieval compactness. Evaluated on multiple QA benchmarks, the method significantly outperforms strong flat and graph-based baselines, achieving higher answer accuracy while drastically reducing index size; RITU drops to 16.31 (versus over 56 for graph-based methods), enabling efficient, precise, and low-overhead retrieval-augmented generation.
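The entity-aware selection mechanism summarized above might be sketched as follows. This is an illustrative toy, not the SlimRAG implementation: the entity table, the 3-dimensional embeddings, and the substring-based entity matching are all assumptions standing in for the paper's learned embeddings and entity identification step.

```python
# Toy sketch of entity-aware context selection (NOT the official SlimRAG code).
from math import sqrt

def cosine(a, b):
    # Cosine similarity between two dense vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# --- Offline indexing: chunks with (text, embedding), plus an
# entity-to-chunk mapping table. Both are hypothetical examples.
chunks = {
    "c1": ("Marie Curie won two Nobel Prizes.", [0.9, 0.1, 0.0]),
    "c2": ("Nobel Prizes are awarded in Stockholm.", [0.2, 0.8, 0.1]),
    "c3": ("Paris is the capital of France.", [0.1, 0.1, 0.9]),
}
entity_to_chunks = {
    "marie curie": ["c1"],
    "nobel prize": ["c1", "c2"],
    "paris": ["c3"],
}

def retrieve(query, query_emb, top_k=2):
    # 1. Identify salient entities in the query (toy: substring match).
    q = query.lower()
    candidates = set()
    for entity, cids in entity_to_chunks.items():
        if entity in q:
            candidates.update(cids)
    # 2. Score the candidate chunks against the query embedding and re-rank.
    ranked = sorted(
        candidates,
        key=lambda cid: cosine(chunks[cid][1], query_emb),
        reverse=True,
    )
    # 3. Assemble a compact context from the top-k chunks -- no graph
    #    traversal or edge construction is involved at any point.
    return [chunks[cid][0] for cid in ranked[:top_k]]

print(retrieve("How many Nobel Prizes did Marie Curie win?", [0.85, 0.2, 0.0]))
```

Note that only chunks linked to query entities are scored, so loosely related but semantically similar chunks (like `c3`) never enter the candidate set.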
📝 Abstract
Retrieval-Augmented Generation (RAG) enhances language models by incorporating external knowledge at inference time. However, graph-based RAG systems often suffer from structural overhead and imprecise retrieval: they require costly pipelines for entity linking and relation extraction, yet frequently return subgraphs filled with loosely related or tangential content. This stems from a fundamental flaw: semantic similarity does not imply semantic relevance. We introduce SlimRAG, a lightweight framework for retrieval without graphs. SlimRAG replaces structure-heavy components with a simple yet effective entity-aware mechanism. At indexing time, it constructs a compact entity-to-chunk table based on semantic embeddings. At query time, it identifies salient entities, retrieves and scores associated chunks, and assembles a concise, contextually relevant input, without graph traversal or edge construction. To quantify retrieval efficiency, we propose Relative Index Token Utilization (RITU), a metric measuring the compactness of retrieved content. Experiments across multiple QA benchmarks show that SlimRAG outperforms strong flat and graph-based baselines in accuracy while reducing index size and RITU (e.g., 16.31 vs. 56+), highlighting the value of structure-free, entity-centric context selection. The code will be released soon at https://github.com/continue-ai-company/SlimRAG.
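The abstract names RITU but does not give its formula. A plausible reading of "Relative Index Token Utilization", consistent with the reported values (16.31 vs. 56+, lower is better), is the index's token count expressed as a percentage of the corpus's token count; the definition and the numbers below are assumptions for illustration only.

```python
# Hypothetical illustration of RITU -- the exact formula is not stated in
# the abstract. Here we ASSUME it is index tokens as a percentage of
# corpus tokens, so a smaller index yields a lower (better) score.
def ritu(index_tokens: int, corpus_tokens: int) -> float:
    return 100.0 * index_tokens / corpus_tokens

# Invented example numbers: a slim entity-to-chunk table vs. a graph
# index that also materializes entities, relations, and summaries.
slim  = ritu(index_tokens=16_310, corpus_tokens=100_000)
graph = ritu(index_tokens=56_000, corpus_tokens=100_000)
print(slim, graph)
```

Under this assumed definition, the gap (16.31 vs. 56.0 in the toy numbers) reflects how much less index material a structure-free, entity-centric index needs per unit of corpus.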