🤖 AI Summary
To address the “lost-in-retrieval” problem in retrieval-augmented multi-hop question answering—where incomplete entity coverage during sub-question decomposition leads to retrieval failure and broken reasoning chains—this paper proposes ChainRAG. The framework introduces a sentence-level graph structure for hop-wise entity completion and establishes a closed-loop, chain-like mechanism integrating retrieval, query rewriting, and feedback to ensure cross-hop information propagation and retrieval completeness. It further introduces entity-aware sub-question rewriting and multi-hop answer-aggregation strategies. ChainRAG is compatible with mainstream large language models, including GPT-4o-mini, Qwen2.5-72B, and GLM-4-Plus. Extensive experiments on MuSiQue, 2Wiki, and HotpotQA demonstrate consistent improvements over state-of-the-art baselines in both answer accuracy and retrieval efficiency.
📝 Abstract
In this paper, we identify a critical problem, "lost-in-retrieval", in retrieval-augmented multi-hop question answering (QA): key entities are missed in LLMs' sub-question decomposition. "Lost-in-retrieval" significantly degrades retrieval performance, which disrupts the reasoning chain and leads to incorrect answers. To resolve this problem, we propose a progressive retrieval and rewriting method, namely ChainRAG, which sequentially handles each sub-question by completing missing key entities and retrieving relevant sentences from a sentence graph for answer generation. Each step in our retrieval and rewriting process builds upon the previous one, creating a seamless chain that leads to accurate retrieval and answers. Finally, all retrieved sentences and sub-question answers are integrated to generate a comprehensive answer to the original question. We evaluate ChainRAG on three multi-hop QA datasets – MuSiQue, 2Wiki, and HotpotQA – using three large language models: GPT-4o-mini, Qwen2.5-72B, and GLM-4-Plus. Empirical results demonstrate that ChainRAG consistently outperforms baselines in both effectiveness and efficiency.
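To make the progressive retrieve-and-rewrite chain concrete, here is a minimal, hypothetical sketch of the idea: sub-questions are answered in order, each sub-answer fills the missing entity of the next hop, and retrieval runs over a toy sentence graph. All names (`SENTENCE_GRAPH`, `chain_answer`, the `#prev` placeholder, the string-matching "retriever" and "reader") are illustrative stand-ins, not the paper's actual implementation, which uses an LLM and a learned retriever.

```python
import re

# Toy sentence graph: each sentence node is linked to the entities it mentions.
# (Illustrative data; not from the paper.)
SENTENCE_GRAPH = {
    "s1": {"text": "Alice Smith directed Film X.",
           "entities": {"Alice Smith", "Film X"}},
    "s2": {"text": "Alice Smith was born in Toronto.",
           "entities": {"Alice Smith", "Toronto"}},
}

ALL_ENTITIES = set().union(*(s["entities"] for s in SENTENCE_GRAPH.values()))

def entities_of(text):
    """Toy entity 'extraction': any known graph entity mentioned verbatim."""
    return {e for e in ALL_ENTITIES if e in text}

def words(text):
    return set(re.findall(r"\w+", text.lower()))

def retrieve(question):
    """Sentence-graph retrieval: keep sentences that share an entity with the
    question, ranked by simple word overlap with it."""
    q_entities, q_words = entities_of(question), words(question)
    cands = [s["text"] for s in SENTENCE_GRAPH.values()
             if s["entities"] & q_entities]
    return sorted(cands, key=lambda t: -len(words(t) & q_words))

def toy_answer(question, sentences):
    """Stand-in for the LLM reader: return the first evidence entity that is
    not already mentioned in the question."""
    for sent in sentences:
        for e in entities_of(sent):
            if e not in question:
                return e
    return ""

def chain_answer(sub_questions):
    """Answer sub-questions in order; each sub-answer fills the '#prev'
    placeholder of the next hop, completing its missing key entity."""
    evidence, prev = [], ""
    for sq in sub_questions:
        sq = sq.replace("#prev", prev)   # entity-aware sub-question rewriting
        sents = retrieve(sq)             # retrieval over the sentence graph
        evidence.extend(sents)
        prev = toy_answer(sq, sents)     # sub-answer propagates to the next hop
    return prev, evidence

# "Where was the director of Film X born?" decomposed into two hops:
final, evidence = chain_answer(["Who directed Film X?", "Where was #prev born?"])
print(final)  # Toronto
```

Without the `#prev` completion step, the second sub-question would lack its key entity and retrieval would fail, which is exactly the "lost-in-retrieval" failure mode the chain is designed to avoid; the accumulated `evidence` plays the role of the retrieved sentences that are finally aggregated into the full answer.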