🤖 AI Summary
In multi-hop retrieval-augmented generation (RAG), the initial retrieval often misses bridging facts, leading to answer failures; existing error-correction methods rely on context expansion, which risks diluting critical evidence with irrelevant information. This paper proposes SEAL-RAG, a training-free controller built on a "replace, not expand" paradigm: under a fixed retrieval depth $k$, it builds a live gap specification of missing entities and relations, then uses entity-anchored extraction and entity-first ranking to actively replace distracting passages rather than blindly expanding the context. Targeted micro-query generation and an iterative Search→Extract→Assess→Loop cycle adaptively re-rank the evidence set. On HotpotQA, SEAL-RAG improves answer accuracy by 3–13 percentage points and evidence precision by 12–18 percentage points over Self-RAG. On 2WikiMultiHopQA, it surpasses Adaptive-$k$ by 8.0 percentage points in answer accuracy and achieves 96% evidence precision, versus CRAG's 22%.
📝 Abstract
Retrieval-Augmented Generation (RAG) systems often fail on multi-hop queries when the initial retrieval misses a bridge fact. Prior corrective approaches, such as Self-RAG, CRAG, and Adaptive-$k$, typically address this by *adding* more context or pruning existing lists. However, simply expanding the context window often leads to **context dilution**, where distractors crowd out relevant information. We propose **SEAL-RAG**, a training-free controller that adopts a **"replace, don't expand"** strategy to fight context dilution under a fixed retrieval depth $k$. SEAL executes a (**S**earch → **E**xtract → **A**ssess → **L**oop) cycle: it performs on-the-fly, entity-anchored extraction to build a live *gap specification* (missing entities/relations), triggers targeted micro-queries, and uses *entity-first ranking* to actively swap out distractors for gap-closing evidence. We evaluate SEAL-RAG against faithful re-implementations of Basic RAG, CRAG, Self-RAG, and Adaptive-$k$ in a shared environment on **HotpotQA** and **2WikiMultiHopQA**. On HotpotQA ($k=3$), SEAL improves answer correctness by **+3–13 pp** and evidence precision by **+12–18 pp** over Self-RAG. On 2WikiMultiHopQA ($k=5$), it outperforms Adaptive-$k$ by **+8.0 pp** in accuracy and maintains **96%** evidence precision compared to 22% for CRAG. These gains are statistically significant ($p<0.001$). By enforcing fixed-$k$ replacement, SEAL yields a predictable cost profile while ensuring the top-$k$ slots are optimized for precision rather than mere breadth. We release our code and data at https://github.com/mosherino/SEAL-RAG.
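The Search → Extract → Assess → Loop cycle can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: all function names (`retrieve`, `extract_entities`, `required_entities`) and the bag-of-entities scoring are hypothetical stand-ins for the paper's entity-anchored extraction and entity-first ranking.

```python
def seal_rag(question, retrieve, extract_entities, required_entities,
             k=3, max_iters=3):
    """Sketch of a fixed-k 'replace, don't expand' controller.

    Keeps exactly k passages; each iteration swaps distractors for
    evidence that closes the current gap specification.
    """
    evidence = retrieve(question, k)  # Search: initial top-k retrieval
    for _ in range(max_iters):
        # Extract: entity-anchored extraction over the current evidence
        covered = set()
        for passage in evidence:
            covered |= extract_entities(passage)
        # Assess: gap specification = entities still missing
        gap = required_entities(question) - covered
        if not gap:
            break  # all bridge entities covered; stop early
        # Search again: one targeted micro-query per missing entity
        candidates = []
        for entity in gap:
            candidates += retrieve(f"{question} {entity}", k)
        # Loop: entity-first ranking over the combined pool; keep the
        # top k, so distractors are replaced and the depth never grows
        def score(passage):
            return len(extract_entities(passage) & gap)
        pool = sorted(evidence + candidates, key=score, reverse=True)
        evidence = pool[:k]  # fixed depth: replace, don't expand
    return evidence
```

The key design point is the last step: candidates compete with the passages already held, so the context window stays at exactly $k$ slots and its cost profile is predictable, unlike expansion-based correctors.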