🤖 AI Summary
Existing adaptive retrieval-augmented generation (ARAG) systems excel at deep, single-source retrieval but struggle to anticipate and jointly regulate multi-source knowledge features, resulting in limited controllability and adaptability in cross-source retrieval. To address this, we propose MSPR—a Multi-Source Adaptive Retrieval-Augmented Generation framework—that pioneers joint decision-making on *when* to retrieve, *what* to retrieve, and *which source* to use. Its core innovations are: (1) a dual-track retrieval mechanism driven by chain-of-thought reasoning and retrieval preference modeling; (2) a dynamic retrieval-action optimization strategy guided by answer-level feedback; and (3) a complementary primary-secondary source selection paradigm. MSPR integrates reasoning-aware retrieval, preference-aware source modeling, and feedback-driven refinement. Evaluated on three benchmark datasets, MSPR achieves significant improvements in answer accuracy and factual consistency while substantially reducing hallucination rates, outperforming state-of-the-art ARAG methods across all metrics.
📝 Abstract
Retrieval-Augmented Generation (RAG) has emerged as a reliable external knowledge augmentation technique to mitigate hallucination issues and parameterized knowledge limitations in Large Language Models (LLMs). Existing Adaptive RAG (ARAG) systems struggle to effectively explore multiple retrieval sources because they cannot select the right source at the right time. To address this, we propose a multi-source ARAG framework, termed MSPR, which synergizes reasoning and preference-driven retrieval to adaptively decide "when and what to retrieve" and "which retrieval source to use". To better adapt to retrieval sources with differing characteristics, we also employ retrieval action adjustment and answer feedback strategies, which enable our framework to fully explore the high-quality primary source while supplementing it with secondary sources at the right time. Extensive, multi-dimensional experiments on three datasets demonstrate the superiority and effectiveness of MSPR.
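The control flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Action`, `adjust`, `answer_with_sources`, `accept`, and `min_primary` are hypothetical, the policy is a plain callable standing in for the LLM, and the specific rules (redirect early secondary lookups to the primary source; allow one supplementary secondary lookup after a rejected answer) are assumptions chosen only to show how "when", "what", and "which source" decisions might fit into one loop.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types for illustration; none of these names come from the paper.
Retriever = Callable[[str], List[str]]


@dataclass
class Action:
    kind: str         # "retrieve" or "answer" (the "when" decision)
    text: str         # query for "retrieve" (the "what"), or the final answer
    source: str = ""  # which source to query (only used for "retrieve")


Policy = Callable[[str, List[str]], Action]


def adjust(action: Action, primary: str, primary_calls: int, min_primary: int) -> Action:
    """Assumed adjustment rule: send early secondary lookups to the primary source."""
    if (action.kind == "retrieve" and action.source != primary
            and primary_calls < min_primary):
        return Action("retrieve", action.text, primary)
    return action


def answer_with_sources(question: str, policy: Policy,
                        sources: Dict[str, Retriever],
                        primary: str, secondary: str,
                        accept: Callable[[str], bool],
                        min_primary: int = 1, max_steps: int = 6) -> str:
    evidence: List[str] = []
    primary_calls = 0
    used_feedback = False
    for _ in range(max_steps):
        action = adjust(policy(question, evidence), primary, primary_calls, min_primary)
        if action.kind == "answer":
            if accept(action.text) or used_feedback:
                return action.text
            # Assumed feedback rule: a rejected answer triggers one
            # supplementary lookup in the secondary source.
            used_feedback = True
            action = Action("retrieve", question, secondary)
        if action.source == primary:
            primary_calls += 1
        evidence.extend(sources[action.source](action.text))
    return ""  # no answer produced within the step budget
```

In this sketch the policy proposes each step, the adjustment rule keeps the primary source from being skipped, and the answer check decides whether a secondary source is consulted before the loop returns.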