🤖 AI Summary
To address the low recall and inefficiency of conventional retrievers in multi-hop question answering, particularly along complex reasoning paths, this paper proposes GeAR, a graph-augmented retrieval framework. GeAR builds semantic graphs and combines a base retriever such as BM25 with graph traversal, an LLM-driven retrieval agent, and iterative re-ranking, so that retrieval is modeled as a structured, multi-step process. Its central contribution is coupling graph-structured modeling with a retrieval agent, moving beyond single-pass retrieval. Evaluated on three standard multi-hop QA benchmarks (MuSiQue, 2WikiMultiHopQA, and HotpotQA), GeAR achieves state-of-the-art results, with improvements exceeding 10% on MuSiQue, while using fewer tokens and fewer retrieval iterations than other multi-step retrieval systems.
📝 Abstract
Retrieval-augmented generation systems rely on effective document retrieval capabilities. By design, conventional sparse or dense retrievers face challenges in multi-hop retrieval scenarios. In this paper, we present GeAR, which advances RAG performance through two key innovations: (i) graph expansion, which enhances any conventional base retriever, such as BM25, and (ii) an agent framework that incorporates graph expansion. Our evaluation demonstrates GeAR's superior retrieval performance on three multi-hop question answering datasets. Additionally, our system achieves state-of-the-art results with improvements exceeding 10% on the challenging MuSiQue dataset, while requiring fewer tokens and iterations compared to other multi-step retrieval systems.
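To make the idea of graph expansion on top of a base retriever concrete, here is a minimal Python sketch. It is not GeAR's implementation, which the abstract does not describe in detail. Everything in it is an assumption made for illustration: the toy corpus, the hand-written entity annotations, the term-overlap scorer standing in for BM25, and the helper names `base_retrieve` and `graph_expand`.

```python
from collections import defaultdict

# Hypothetical toy corpus: passage id -> text.
PASSAGES = {
    "p1": "Alice Munro wrote the short story collection Runaway.",
    "p2": "Runaway was published by McClelland and Stewart.",
    "p3": "McClelland and Stewart is based in Toronto.",
    "p4": "Bananas are rich in potassium.",
}

# Hypothetical entity annotations per passage (assumed to come from
# some upstream extraction step that is not shown here).
ENTITIES = {
    "p1": {"Alice Munro", "Runaway"},
    "p2": {"Runaway", "McClelland and Stewart"},
    "p3": {"McClelland and Stewart", "Toronto"},
    "p4": {"Bananas"},
}


def tokenize(text):
    """Lowercase whitespace tokens with trailing punctuation stripped."""
    return [t.strip(".,?").lower() for t in text.split()]


def base_retrieve(query, k=1):
    """Stand-in for a sparse retriever: rank passages by shared terms.

    This is a simple term-overlap count, not BM25.
    """
    q = set(tokenize(query))
    scored = [(len(q & set(tokenize(text))), pid) for pid, text in PASSAGES.items()]
    scored = [(s, pid) for s, pid in scored if s > 0]
    scored.sort(key=lambda x: (-x[0], x[1]))
    return [pid for _, pid in scored[:k]]


def graph_expand(seeds, hops=1):
    """Add passages reachable from the seeds via shared entities."""
    # Entity -> passages mentioning it; shared entities act as graph edges.
    index = defaultdict(set)
    for pid, ents in ENTITIES.items():
        for ent in ents:
            index[ent].add(pid)

    seen = list(seeds)
    frontier = list(seeds)
    for _ in range(hops):
        next_frontier = []
        for pid in frontier:
            for ent in sorted(ENTITIES[pid]):
                for neighbor in sorted(index[ent]):
                    if neighbor not in seen:
                        seen.append(neighbor)
                        next_frontier.append(neighbor)
        frontier = next_frontier
    return seen


query = "Where is the publisher of the collection Alice Munro wrote based?"
seeds = base_retrieve(query, k=1)
expanded = graph_expand(seeds, hops=2)
print(seeds, expanded)
```

In this toy setup the single-pass retriever returns only the passage that overlaps lexically with the question, while two hops of expansion also reach the passages about the publisher and its location. An agent framework of the kind the abstract mentions would additionally decide when to stop or re-query; that part is not sketched here.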