🤖 AI Summary
Large language models (LLMs) frequently produce factual inconsistencies and hallucinations in retrieval-augmented generation (RAG) because they integrate external knowledge poorly. To quantify and address this gap, we introduce *RAG-ability*, the capacity to preserve factual consistency in retrieval-augmented settings, and *Entity-Context Divergence (ECD)*, a metric measuring how faithfully retrieved information is reflected in model outputs. Building on these, we present Radiant, a framework that adapts direct preference optimization (DPO) to RAG settings, jointly modeling entity alignment and context fidelity to achieve end-to-end alignment between retrieved evidence and generated outputs. Empirical evaluation demonstrates substantial improvements in factual consistency across diverse LLMs under challenging conditions, including noisy retrievals and knowledge conflicts, reducing hallucination rates by up to 27.3%. Radiant also exhibits superior robustness compared to existing RAG optimization approaches.
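The exact ECD formulation isn't spelled out here; as a rough illustration only, the sketch below assumes ECD can be approximated by the fraction of entities in the retrieved context that never appear in the generated answer. The function names and the spaCy-based entity extractor are placeholders, not the paper's implementation.

```python
# Hypothetical sketch of an Entity-Context Divergence (ECD)-style score.
# Assumption: ECD ~ fraction of context entities missing from the answer
# (the paper's exact definition may differ).
import spacy  # off-the-shelf NER; any entity extractor would work

nlp = spacy.load("en_core_web_sm")


def extract_entities(text: str) -> set[str]:
    """Return the set of lower-cased named-entity strings found in `text`."""
    return {ent.text.lower() for ent in nlp(text).ents}


def entity_context_divergence(retrieved_context: str, generated_answer: str) -> float:
    """Score in [0, 1]: 0 means every context entity is reproduced in the answer,
    1 means none are (higher = larger divergence from the retrieved evidence)."""
    context_entities = extract_entities(retrieved_context)
    if not context_entities:
        return 0.0  # nothing to ground against
    answer_entities = extract_entities(generated_answer)
    missing = context_entities - answer_entities
    return len(missing) / len(context_entities)
```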
📝 Abstract
As Large Language Models (LLMs) continue to advance, Retrieval-Augmented Generation (RAG) has emerged as a vital technique for enhancing factual accuracy by integrating external knowledge into the generation process. However, LLMs often fail to faithfully integrate retrieved evidence into their generated responses, leading to factual inconsistencies. To quantify this gap, we introduce Entity-Context Divergence (ECD), a metric that measures the extent to which retrieved information is accurately reflected in model outputs. We systematically evaluate contemporary LLMs on their ability to preserve factual consistency in retrieval-augmented settings, a capability we define as RAG-ability. Our empirical analysis reveals that RAG-ability remains low across most LLMs, highlighting significant challenges in entity retention and context fidelity. This paper introduces Radiant (Retrieval AugmenteD entIty-context AligNmenT), a novel framework that brings alignment to RAG, optimizing the interplay between retrieved evidence and generated content. Radiant extends Direct Preference Optimization (DPO) to teach LLMs to integrate retrieved information into their subsequent generations. Acting as a behavior-correction mechanism, Radiant improves RAG performance across varied retrieval scenarios, such as noisy web contexts and knowledge conflicts, while reducing hallucination. This enables more reliable, contextually grounded, and factually coherent content generation.
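For intuition on how DPO carries over to RAG, here is a minimal sketch of a DPO-style preference loss in which the prompt already contains the retrieved passages, context-faithful responses play the role of the preferred ("chosen") answers, and responses that ignore or contradict the evidence are "rejected". Radiant's actual objective may differ; the function below is a hypothetical illustration of the standard DPO loss only.

```python
# Minimal sketch of a DPO-style preference loss adapted to RAG.
# Assumption: each log-prob tensor is the summed log-probability of a response
# conditioned on the prompt *plus retrieved context*; shapes are (batch,).
import torch
import torch.nn.functional as F


def dpo_rag_loss(policy_chosen_logps: torch.Tensor,
                 policy_rejected_logps: torch.Tensor,
                 ref_chosen_logps: torch.Tensor,
                 ref_rejected_logps: torch.Tensor,
                 beta: float = 0.1) -> torch.Tensor:
    """Standard DPO loss: push the policy to prefer context-faithful responses
    over context-contradicting ones, relative to a frozen reference model."""
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```

In this framing, the only change from vanilla DPO is the data: preference pairs are constructed around faithfulness to the retrieved evidence rather than generic human preference.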