🤖 AI Summary
Iterative retrieval methods for multi-hop question answering struggle to judge when the retrieved evidence is sufficient, and they often accumulate redundant or noisy text. To address this, the authors propose S2G-RAG, a framework with an explicit controller, S2G-Judge, that evaluates after each retrieval round whether the accumulated evidence is sufficient to answer the question. If it is not, the controller outputs structured gap items describing the missing information, which are mapped into the next retrieval query. To limit noise accumulation, the framework maintains a sentence-level Evidence Context extracted from the retrieved documents. S2G-RAG requires neither modifications to the search engine nor retraining of the generator, so it can be integrated into existing RAG systems as a lightweight component. Experiments on TriviaQA, HotpotQA, and 2WikiMultiHopQA show improved multi-hop QA performance and robustness under multi-turn retrieval.
📝 Abstract
Retrieval-Augmented Generation (RAG) grounds language models in external evidence, but multi-hop question answering remains difficult because iterative pipelines must control what to retrieve next and when the available evidence is adequate. In practice, systems may answer from incomplete evidence chains, or they may accumulate redundant or distractor-heavy text that interferes with later retrieval and reasoning. We propose S2G-RAG (Structured Sufficiency and Gap-judging RAG), an iterative framework with an explicit controller, S2G-Judge. At each turn, S2G-Judge predicts whether the current evidence memory supports answering and, if not, outputs structured gap items that describe the missing information. These gap items are then mapped into the next retrieval query, producing stable multi-turn retrieval trajectories. To reduce noise accumulation, S2G-RAG maintains a sentence-level Evidence Context by extracting a compact set of relevant sentences from retrieved documents. Experiments on TriviaQA, HotpotQA, and 2WikiMultiHopQA show that S2G-RAG improves multi-hop QA performance and robustness under multi-turn retrieval. Furthermore, S2G-RAG can be integrated into existing RAG pipelines as a lightweight component, without modifying the search engine or retraining the generator.
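The control loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `Judgment` and `s2g_loop`, the toy corpus, the keyword-based stand-ins for the retriever, sentence extractor, judge, and generator, and the rule of joining gap items into the next query are all assumptions made for the example. In the actual framework these roles would be filled by a search engine and learned models.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Judgment:
    """Hypothetical output of the judge: a sufficiency flag plus gap items."""
    sufficient: bool
    gaps: List[str] = field(default_factory=list)


def s2g_loop(
    question: str,
    retrieve: Callable[[str], List[str]],
    extract: Callable[[str, List[str]], List[str]],
    judge: Callable[[str, List[str]], Judgment],
    generate: Callable[[str, List[str]], str],
    max_turns: int = 4,
) -> Tuple[str, List[str], int]:
    """Iterate retrieve -> extract -> judge until evidence is judged sufficient."""
    evidence: List[str] = []  # sentence-level evidence context
    query = question
    turns = 0
    for turn in range(1, max_turns + 1):
        turns = turn
        docs = retrieve(query)
        # Keep only relevant sentences, skipping ones already stored.
        for sentence in extract(question, docs):
            if sentence not in evidence:
                evidence.append(sentence)
        verdict = judge(question, evidence)
        if verdict.sufficient or not verdict.gaps:
            break
        # Assumed gap-to-query mapping: concatenate the gap descriptions.
        query = " ".join(verdict.gaps)
    return generate(question, evidence), evidence, turns


# Toy stand-ins below (all hypothetical), so the loop can be run end to end.
CORPUS = {
    "Film X": "Film X was directed by Jane Doe. Film X was released in 1999.",
    "Jane Doe": "Jane Doe was born in Oslo. Jane Doe collects stamps.",
}


def retrieve(query: str) -> List[str]:
    # Return documents whose title appears in the query.
    return [text for title, text in CORPUS.items() if title.lower() in query.lower()]


def extract(question: str, docs: List[str]) -> List[str]:
    # Keep sentences that mention a relation the toy question depends on.
    keep = ("directed by", "born in")
    sentences = [s.strip().rstrip(".") for d in docs for s in d.split(". ")]
    return [s for s in sentences if any(k in s for k in keep)]


def judge(question: str, evidence: List[str]) -> Judgment:
    # Sufficient once a birthplace sentence is present; otherwise name the gap.
    if any("born in" in s for s in evidence):
        return Judgment(sufficient=True)
    for s in evidence:
        if "directed by" in s:
            director = s.split("directed by ")[1]
            return Judgment(False, [f"birthplace of {director}"])
    return Judgment(False, ["director of Film X"])


def generate(question: str, evidence: List[str]) -> str:
    # Read the answer off the birthplace sentence, if one was collected.
    for s in evidence:
        if "born in" in s:
            return s.split("born in ")[1]
    return "unknown"


if __name__ == "__main__":
    print(s2g_loop("Where was the director of Film X born?",
                   retrieve, extract, judge, generate))
```

In this toy run, the first turn retrieves only the film document, the judge reports the director's birthplace as the missing item, and the second turn retrieves the document that closes the gap. Passing the components as callables mirrors the abstract's claim that the search engine and generator are left unchanged, though how S2G-RAG wires them together in practice is not specified here.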