🤖 AI Summary
Problem: Retrieval-Augmented Generation (RAG) improves factual accuracy, but erroneous or biased retrieval can mislead generation and compound hallucinations, a failure mode termed Hallucination on Hallucination.
Method: We propose DRAG (Debate-Augmented RAG), a training-free framework that applies Multi-Agent Debate (MAD) in both stages of the pipeline: (1) in retrieval, structured debates among proponent, opponent, and judge agents refine the retrieved evidence and check its factual reliability; (2) in generation, agents with asymmetric information roles debate adversarially to make reasoning more robust and reduce factual inconsistencies.
Contribution/Results: DRAG requires zero parameter fine-tuning, achieves strong robustness against retrieval noise, and ensures high factual consistency. Experiments demonstrate significantly improved retrieval reliability and reduced hallucination rates across diverse benchmarks; average factual accuracy increases by 12.7%. DRAG establishes a lightweight, interpretable, and training-free paradigm for trustworthy RAG.
📝 Abstract
Retrieval-Augmented Generation (RAG) enhances factual accuracy by integrating external knowledge, yet it introduces a critical issue: erroneous or biased retrieval can mislead generation, compounding hallucinations, a phenomenon we term Hallucination on Hallucination. To address this, we propose Debate-Augmented RAG (DRAG), a training-free framework that integrates Multi-Agent Debate (MAD) mechanisms into both retrieval and generation stages. In retrieval, DRAG employs structured debates among proponents, opponents, and judges to refine retrieval quality and ensure factual reliability. In generation, DRAG introduces asymmetric information roles and adversarial debates, enhancing reasoning robustness and mitigating factual inconsistencies. Evaluations across multiple tasks demonstrate that DRAG improves retrieval reliability, reduces RAG-induced hallucinations, and significantly enhances overall factual accuracy. Our code is available at https://github.com/Huenao/Debate-Augmented-RAG.
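The two debate stages described in the abstract can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation (see the linked repository for that). The function names `retrieval_debate` and `generation_debate`, the agent signatures, the `"keep"` verdict string, and the toy corpus are all hypothetical. Real agents would be LLM calls, which are replaced here by small deterministic stubs so the control flow can be run as-is.

```python
from typing import Callable, List, Optional

# Hypothetical retriever type: maps a query string to a list of documents.
Retriever = Callable[[str], List[str]]


def retrieval_debate(question: str, retrieve: Retriever,
                     proponent: Callable, opponent: Callable,
                     judge: Callable, max_rounds: int = 2) -> List[str]:
    """Assumed retrieval-stage loop: debate whether the evidence pool suffices."""
    pool = list(retrieve(question))
    for _ in range(max_rounds):
        support = proponent(question, pool)    # argues the pool is sufficient
        new_query = opponent(question, pool)   # proposes a query to fill a gap
        if judge(support, new_query) == "keep":
            break
        # The judge sided with the opponent: extend the pool with new evidence.
        for doc in retrieve(new_query):
            if doc not in pool:
                pool.append(doc)
    return pool


def generation_debate(question: str, pool: List[str],
                      informed: Callable, uninformed: Callable,
                      judge: Callable, max_rounds: int = 2) -> str:
    """Assumed generation-stage loop with asymmetric information:
    only the `informed` agent sees the retrieved evidence."""
    answer_a: Optional[str] = None
    answer_b: Optional[str] = None
    for _ in range(max_rounds):
        answer_a = informed(question, pool, answer_b)
        answer_b = uninformed(question, answer_a)
    return judge(question, answer_a, answer_b)


# --- Toy stubs standing in for LLM agents (illustrative only) ---
corpus = {
    "capital of australia": ["Sydney is Australia's largest city."],
    "australia seat of government": ["Canberra is the capital of Australia."],
}
retrieve = lambda q: corpus.get(q, [])
proponent = lambda q, pool: (
    "sufficient" if any("capital of" in d for d in pool) else "insufficient")
opponent = lambda q, pool: "australia seat of government"
retrieval_judge = lambda support, new_query: (
    "keep" if support == "sufficient" else "refine")
informed = lambda q, pool, other: (
    "Canberra" if any("Canberra" in d for d in pool) else "unknown")
uninformed = lambda q, other: other or "Sydney"
generation_judge = lambda q, a, b: a if a == b else "undecided"

question = "capital of australia"
pool = retrieval_debate(question, retrieve, proponent, opponent, retrieval_judge)
answer = generation_debate(question, pool, informed, uninformed, generation_judge)
```

In this toy run the first retrieval returns only a misleading document, the judge sides with the opponent, and a second query adds the relevant one. The design point the sketch tries to capture is that neither stage updates any model parameters: both are plain control loops around existing agents, which is consistent with the training-free claim.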