Probing Latent Knowledge Conflict for Faithful Retrieval-Augmented Generation

📅 2025-10-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing RAG systems suffer from unfaithful responses—i.e., generated outputs inconsistent with retrieved evidence—primarily due to neglecting how large language models (LLMs) internally integrate retrieved evidence with parametric knowledge, especially under knowledge conflicts. This work introduces the first sentence-level knowledge conflict identification method based on hidden-state probing, uncovering LLMs’ representational dynamics during conflict resolution. Building on this insight, we propose a hierarchical knowledge decomposition and conflict-aware fine-tuning framework that enables precise coordination between parametric memory and external evidence. Our approach integrates fine-grained context decomposition, attention enhancement, and lightweight, conflict-driven adaptation. Evaluated on three standard benchmarks, it significantly improves generation accuracy and contextual faithfulness, outperforming multiple strong baselines—particularly in complex conflict scenarios. This work establishes a novel paradigm for trustworthy RAG by explicitly modeling and resolving knowledge conflicts within the LLM’s internal representation space.

📝 Abstract
Retrieval-Augmented Generation (RAG) has emerged as a powerful paradigm to enhance the factuality of Large Language Models (LLMs). However, existing RAG systems often suffer from an unfaithfulness issue, where the model's response contradicts evidence from the retrieved context. Existing approaches to improving contextual faithfulness largely rely on external interventions, such as prompt engineering, decoding constraints, or reward-based fine-tuning. These works treat the LLM as a black box and overlook a crucial question: how does the LLM internally integrate retrieved evidence with its parametric memory, particularly under knowledge conflicts? To address this gap, we conduct a probing-based analysis of hidden-state representations in LLMs and observe three findings: knowledge integration occurs hierarchically, conflicts manifest as latent signals at the sentence level, and irrelevant context is often amplified when aligned with parametric knowledge. Building on these findings, we propose CLEAR (Conflict-Localized and Enhanced Attention for RAG), a framework that (i) decomposes context into fine-grained sentence-level knowledge, (ii) employs hidden-state probing to localize conflicting knowledge, and (iii) introduces conflict-aware fine-tuning to guide the model to accurately integrate retrieved evidence. Extensive experiments across three benchmarks demonstrate that CLEAR substantially improves both accuracy and contextual faithfulness, consistently outperforming strong baselines under diverse conflict conditions. The related resources are available at https://github.com/LinfengGao/CLEAR.
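The abstract's second step (hidden-state probing to localize conflicting sentences) can be illustrated with a minimal sketch. Everything below is a synthetic stand-in: the 16-dimensional "hidden states" and the single conflict direction are fabricated for demonstration, whereas in practice the features would be extracted from an intermediate LLM layer per sentence, and the probe architecture in the paper may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sentence-level hidden states (dim 16); sentences that conflict
# with parametric memory are shifted along one axis to mimic a latent signal.
dim, n = 16, 200
labels = rng.integers(0, 2, n)            # 1 = conflicts with parametric memory
states = rng.normal(size=(n, dim))
states[labels == 1, 0] += 2.0             # synthetic "conflict direction"

# Linear probe trained with plain gradient descent on the logistic loss.
w, b = np.zeros(dim), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(states @ w + b)))
    grad = p - labels
    w -= 0.1 * (states.T @ grad) / n
    b -= 0.1 * grad.mean()

def conflict_score(h):
    """Probability that a sentence's hidden state signals a knowledge conflict."""
    return 1.0 / (1.0 + np.exp(-(h @ w + b)))

train_acc = ((conflict_score(states) > 0.5) == labels).mean()
```

Sentences whose `conflict_score` exceeds a threshold would then be flagged as conflict regions for the downstream fine-tuning stage; on this synthetic data the probe separates the two classes well above chance.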
Problem

Research questions and friction points this paper is trying to address.

Detecting latent knowledge conflicts in retrieval-augmented generation systems
Improving contextual faithfulness when retrieved evidence contradicts parametric memory
Developing conflict-aware fine-tuning for accurate evidence integration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decomposes context into fine-grained sentence-level knowledge
Uses hidden-state probing to localize conflicting knowledge
Introduces conflict-aware fine-tuning for evidence integration
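The third bullet, conflict-aware fine-tuning, can be sketched as a reweighted training objective. The paper's exact loss is not reproduced here; the function below is one plausible form, where tokens inside probe-flagged conflict sentences receive extra weight (the `alpha` hyperparameter is an assumption) so the model is pushed to follow retrieved evidence precisely where its parametric memory disagrees.

```python
import numpy as np

def conflict_aware_loss(token_logprobs, conflict_flags, alpha=2.0):
    """Weighted negative log-likelihood over gold (evidence-consistent) tokens.

    token_logprobs: log-probabilities the model assigns to the gold tokens
    conflict_flags: 1 where the token lies in a probe-flagged sentence, else 0
    alpha: extra weight on conflict-region tokens (assumed hyperparameter)
    """
    weights = 1.0 + (alpha - 1.0) * np.asarray(conflict_flags, dtype=float)
    nll = -np.asarray(token_logprobs, dtype=float)
    return float((weights * nll).sum() / weights.sum())

# Usage: a conflict token the model gets wrong (low prob) dominates the loss.
logprobs = np.log([0.9, 0.2, 0.8])
loss = conflict_aware_loss(logprobs, conflict_flags=[0, 1, 0])
```

With `alpha=1.0` the function reduces to the plain mean negative log-likelihood, so the conflict weighting is a strict generalization of standard fine-tuning.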