🤖 AI Summary
In multi-hop question answering, existing RAG methods suffer from two key limitations: rigid retrieval frequency and insufficient exploitation of historical knowledge. To address these limitations, we propose MIND, a memory-augmented dynamic RAG framework. First, it introduces an uncertainty-driven retrieval triggering mechanism, grounded in token-level entropy and self-attention confidence, to enable on-demand, adaptive retrieval. Second, it incorporates a cross-step fact caching module and a memory-aware filtering mechanism that support incremental storage, confidence-based reweighting, and reuse of high-confidence facts, thereby mitigating knowledge forgetting. MIND integrates prompt-guided entity extraction, token-level uncertainty modeling, and memory-enhanced generation. Evaluated on HotpotQA and 2WikiMultiHopQA, MIND substantially improves answer accuracy, reduces redundant retrievals by 47%, and enhances both multi-hop reasoning consistency and retrieval efficiency.
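The uncertainty-driven triggering idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the use of mean Shannon entropy over a step's next-token distributions, and the threshold value are all assumptions.

```python
import math

def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def should_retrieve(step_probs, entropy_threshold=1.0):
    """Trigger retrieval when the mean token-level entropy over a reasoning
    step exceeds a threshold, i.e. the model is uncertain about what to
    generate next. The threshold here is illustrative, not from the paper."""
    mean_entropy = sum(token_entropy(p) for p in step_probs) / len(step_probs)
    return mean_entropy > entropy_threshold

# A near-uniform distribution (high uncertainty) triggers retrieval;
# a sharply peaked one (high confidence) does not.
print(should_retrieve([[0.25, 0.25, 0.25, 0.25]]))   # → True
print(should_retrieve([[0.97, 0.01, 0.01, 0.01]]))   # → False
```

In the paper's framing this signal is combined with self-attention confidence, which the sketch above omits.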
📝 Abstract
Multi-hop question answering (QA) requires models to retrieve and reason over multiple pieces of evidence. While Retrieval-Augmented Generation (RAG) has made progress in this area, existing methods often suffer from two key limitations: (1) fixed or overly frequent retrieval steps, and (2) ineffective use of previously retrieved knowledge. We propose MIND (Memory-Informed and INteractive Dynamic RAG), a framework that addresses these challenges through: (i) prompt-based entity extraction to identify reasoning-relevant elements, (ii) dynamic retrieval triggering based on token-level entropy and attention signals, and (iii) memory-aware filtering, which stores high-confidence facts across reasoning steps to enable consistent multi-hop generation.
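A cross-step fact cache with confidence-based reweighting, as described in points (iii) of the abstract, might look like the following sketch. The class name, the additive confidence boost for re-observed facts, and the filtering threshold are assumptions for illustration, not the paper's actual design.

```python
class FactMemory:
    """Cross-step fact cache: stores facts with confidence scores, reweights
    facts that are observed again at later steps, and filters low-confidence
    entries so only reliable facts are reused in subsequent generation."""

    def __init__(self, min_confidence=0.5, boost=0.1):
        self.facts = {}                      # fact text -> confidence score
        self.min_confidence = min_confidence # reuse threshold (illustrative)
        self.boost = boost                   # reward for repeated observation

    def add(self, fact, confidence):
        """Incrementally store a fact; re-observed facts gain confidence."""
        if fact in self.facts:
            self.facts[fact] = min(1.0, self.facts[fact] + self.boost)
        else:
            self.facts[fact] = confidence

    def retrieve(self):
        """Return only high-confidence facts for reuse at the next hop."""
        return [f for f, c in self.facts.items() if c >= self.min_confidence]


memory = FactMemory()
memory.add("Paris is the capital of France", 0.9)
memory.add("uncertain claim", 0.3)
print(memory.retrieve())  # only the high-confidence fact is reused
```

The design choice sketched here, boosting confidence each time a fact recurs across hops, is one simple way to realize "confidence-based reweighting"; the paper may use a different scheme.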