Guaranteeing Knowledge Integration with Joint Decoding for Retrieval-Augmented Generation

📅 2026-04-09
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge in retrieval-augmented generation (RAG) where large language models struggle to effectively integrate retrieved evidence due to conflicts with their internal knowledge. To resolve this, the authors propose GuarantRAG, a novel framework that explicitly decouples reasoning from evidence integration. The approach first generates an Inner-Answer based solely on internal knowledge, then trains the model via a contrastive DPO objective—treating retrieved documents as positive references and the Inner-Answer as a negative signal—to produce a Refer-Answer faithful to external evidence. Finally, token-level joint decoding dynamically fuses both answers during inference. Evaluated across five question-answering benchmarks, GuarantRAG achieves up to a 12.1% absolute improvement in accuracy and reduces hallucination by 16.3%, substantially outperforming existing RAG methods.
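The token-level fusion step described above can be sketched as a weighted combination of the two passes' logits at each decoding step. This is a minimal illustration, not the paper's implementation: the exact fusion rule and weighting scheme (here a fixed `alpha`) are assumptions.

```python
import numpy as np

def joint_decode_step(inner_logits: np.ndarray,
                      refer_logits: np.ndarray,
                      alpha: float = 0.5) -> int:
    """Fuse per-token logits from the Inner-Answer pass (parametric
    knowledge) and the Refer-Answer pass (evidence-grounded) with a
    weighted sum, then pick the next token greedily.
    NOTE: the weighted-sum rule and alpha are illustrative assumptions;
    the paper's dynamic fusion mechanism is not specified here."""
    fused = alpha * refer_logits + (1.0 - alpha) * inner_logits
    return int(np.argmax(fused))

# Toy vocabulary of 4 tokens: the parametric pass prefers token 0,
# the evidence pass prefers token 2.
inner = np.array([2.0, 0.5, 0.1, 0.0])
refer = np.array([0.1, 0.2, 3.0, 0.0])
print(joint_decode_step(inner, refer))  # evidence-preferred token wins at alpha=0.5 → 2
```

In a real system both logit vectors would come from the same model conditioned on different contexts (with and without the retrieved documents), and `alpha` could be set per token rather than globally.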
📝 Abstract
Retrieval-Augmented Generation (RAG) significantly enhances Large Language Models (LLMs) by providing access to external knowledge. However, current research primarily focuses on retrieval quality, often overlooking the critical "integration bottleneck": even when relevant documents are retrieved, LLMs frequently fail to utilize them effectively due to conflicts with their internal parametric knowledge. In this paper, we argue that implicitly resolving this conflict in a single generation pass is suboptimal. We introduce GuarantRAG, a framework that explicitly decouples reasoning from evidence integration. First, we generate an "Inner-Answer" based solely on parametric knowledge to capture the model's reasoning flow. Second, to guarantee faithful evidence extraction, we generate a "Refer-Answer" using a novel Contrastive DPO objective. This objective treats the parametric Inner-Answer as a negative constraint and the retrieved documents as positive ground truth, forcing the model to suppress internal hallucinations in favor of external evidence during this phase. Finally, rather than naively concatenating the two answers or using the DPO-trained model directly, we propose a joint decoding mechanism that dynamically fuses the logical coherence of the Inner-Answer with the factual precision of the Refer-Answer at the token level. Experiments on five QA benchmarks demonstrate that GuarantRAG improves accuracy by up to 12.1% and reduces hallucinations by 16.3% compared to standard and dynamic RAG baselines.
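The Contrastive DPO objective in the abstract pairs the evidence-grounded Refer-Answer (preferred) against the parametric Inner-Answer (rejected) and applies the standard DPO loss. A minimal sketch, assuming the usual DPO formulation with a frozen reference model; the pairing construction and `beta` value are assumptions, not details from the paper.

```python
import math

def contrastive_dpo_loss(logp_pos: float, logp_neg: float,
                         ref_logp_pos: float, ref_logp_neg: float,
                         beta: float = 0.1) -> float:
    """Standard DPO loss, -log sigmoid(beta * margin), where the
    'chosen' sequence is the evidence-faithful Refer-Answer and the
    'rejected' sequence is the parametric Inner-Answer.
    Inputs are summed sequence log-probabilities under the policy
    (logp_*) and the frozen reference model (ref_logp_*).
    NOTE: this pairing scheme follows the summary; beta and the exact
    batch construction are illustrative assumptions."""
    margin = beta * ((logp_pos - ref_logp_pos) - (logp_neg - ref_logp_neg))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# When policy and reference agree, the margin is zero and the loss is log 2.
print(round(contrastive_dpo_loss(-10.0, -10.0, -10.0, -10.0), 4))  # → 0.6931
```

Minimizing this loss pushes the policy to raise the likelihood of the evidence-grounded answer relative to the reference model while lowering that of the purely parametric one, which is what lets the Refer-Answer pass suppress internal hallucinations.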
Problem

Research questions and friction points this paper is trying to address.

Retrieval-Augmented Generation
knowledge integration
hallucination
parametric knowledge
evidence utilization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Retrieval-Augmented Generation
knowledge integration
Contrastive DPO
joint decoding
hallucination reduction