🤖 AI Summary
Conventional RAG systems rely on semantic or lexical matching, which often fails to ensure logical relevance of retrieved passages, thereby limiting question-answering accuracy.
Method: This paper proposes HopRAG, a multi-hop reasoning-enhanced RAG framework built on graph-structured knowledge exploration. It introduces a pseudo-query-driven passage graph construction method and a three-stage retrieve-reason-prune inference mechanism: (1) using LLM-generated pseudo-queries to link passages as graph edges during indexing; (2) discovering implicit logical pathways via multi-hop neighborhood exploration; and (3) applying reasoning-based dynamic pruning to improve both efficiency and precision.
Contribution/Results: The framework shifts retrieval from semantic matching to logic-driven reasoning. On standard benchmarks, it achieves 76.78% higher answer accuracy and a 65.07% higher retrieval F1 score than conventional methods, significantly outperforming state-of-the-art RAG approaches.
📝 Abstract
Retrieval-Augmented Generation (RAG) systems often struggle with imperfect retrieval, as traditional retrievers focus on lexical or semantic similarity rather than logical relevance. To address this, we propose HopRAG, a novel RAG framework that augments retrieval with logical reasoning through graph-structured knowledge exploration. During indexing, HopRAG constructs a passage graph, with text chunks as vertices and logical connections established via LLM-generated pseudo-queries as edges. During retrieval, it employs a retrieve-reason-prune mechanism: starting with lexically or semantically similar passages, the system explores multi-hop neighbors guided by pseudo-queries and LLM reasoning to identify truly relevant ones. Extensive experiments demonstrate HopRAG's superiority, achieving 76.78% higher answer accuracy and 65.07% improved retrieval F1 score compared to conventional methods. The repository is available at https://github.com/LIU-Hao-2002/HopRAG.
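The indexing and retrieval stages described above can be sketched in a few dozen lines. This is a minimal toy illustration, not the paper's implementation: the `HopGraph` class, the Jaccard-overlap `lexical_sim`, the hand-wired edges, and the `judge` callback (a stand-in for the LLM reasoning step that decides whether a pseudo-query edge is worth traversing) are all assumptions made for the sketch.

```python
from collections import deque

def lexical_sim(a, b):
    # Toy similarity: Jaccard overlap of word sets (stand-in for a real
    # lexical/semantic retriever such as BM25 or a dense embedding model).
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

class HopGraph:
    """Toy passage graph in the spirit of HopRAG: vertices are text chunks,
    directed edges carry pseudo-queries (LLM-generated in the paper,
    hand-wired here for illustration)."""
    def __init__(self):
        self.passages = {}   # passage id -> text
        self.edges = {}      # passage id -> list of (neighbor id, pseudo-query)

    def add_passage(self, pid, text):
        self.passages[pid] = text
        self.edges.setdefault(pid, [])

    def add_edge(self, src, dst, pseudo_query):
        # In the paper, an edge src -> dst exists when a pseudo-query
        # generated from src is answered by dst.
        self.edges[src].append((dst, pseudo_query))

    def retrieve_reason_prune(self, question, judge, seeds=1, max_hops=2, keep=3):
        # 1) Retrieve: seed with the most lexically similar passages.
        ranked = sorted(self.passages,
                        key=lambda p: lexical_sim(question, self.passages[p]),
                        reverse=True)
        frontier = deque((p, 0) for p in ranked[:seeds])
        scores = {p: lexical_sim(question, self.passages[p]) for p, _ in frontier}
        # 2) Reason: expand multi-hop neighbors whose pseudo-query the
        #    judge deems logically useful for answering the question.
        while frontier:
            pid, depth = frontier.popleft()
            if depth >= max_hops:
                continue
            for nbr, pq in self.edges[pid]:
                if nbr not in scores and judge(question, pq):
                    # Reward passages reached through accepted logical hops.
                    scores[nbr] = lexical_sim(question, self.passages[nbr]) + 0.5 / (depth + 1)
                    frontier.append((nbr, depth + 1))
        # 3) Prune: keep only the top-scoring passages for generation.
        return sorted(scores, key=scores.get, reverse=True)[:keep]

g = HopGraph()
g.add_passage("p1", "Marie Curie won the Nobel Prize in Physics in 1903")
g.add_passage("p2", "Pierre Curie was her husband and shared the award")
g.add_passage("p3", "The Eiffel Tower is in Paris")
g.add_edge("p1", "p2", "who shared the prize with marie curie")

# Cheap stand-in for the LLM hop judgment: word overlap with the question.
judge = lambda q, pq: len(set(q.lower().split()) & set(pq.split())) >= 2
hits = g.retrieve_reason_prune("Who shared the Nobel Prize with Marie Curie?", judge)
```

The point of the sketch is the control flow: the seed passage `p1` is found by plain similarity, but `p2`, which barely overlaps the question lexically, is reached through a pseudo-query edge and promoted, which is exactly the "logically relevant but lexically distant" case that motivates the retrieve-reason-prune design.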