🤖 AI Summary
This work addresses the issue in retrieval-augmented generation (RAG) where a model's internal parametric knowledge overrides external retrieved evidence, leading to unfaithful outputs and hallucinations. The authors propose a label-free method that identifies deep hidden states biased by parametric priors by comparing forward-pass logits with and without retrieved context. Context-aware corrections are then applied directly at these hidden layers, circumventing the limitations of decoding-time adjustments and weight editing. Evaluated on question answering and summarization tasks, the approach significantly improves output faithfulness and effectively suppresses hallucinations, outperforming multiple strong baselines.
📄 Abstract
Retrieval-Augmented Generation (RAG) often struggles with knowledge conflicts, where model-internal parametric knowledge overrides retrieved evidence, leading to unfaithful outputs. Existing approaches are often limited, relying either on superficial decoding-time adjustments or on weight editing that requires ground-truth targets. Through layer-wise analysis, we attribute this failure to a parametric suppression phenomenon: in deep layers, certain FFN sublayers overwrite context-sensitive representations with memorized priors. To address this, we propose CoRect (Context-Aware Logit Contrast for Hidden State Rectification). By contrasting logits from contextualized and non-contextualized forward passes, CoRect identifies layers that exhibit high parametric bias without requiring ground-truth labels. It then rectifies the hidden states at those layers to preserve evidence-grounded information. Across question answering (QA) and summarization benchmarks, CoRect consistently improves faithfulness and reduces hallucinations compared to strong baselines.
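The detection-and-rectification idea can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the authors' implementation: it assumes per-layer next-token logits are available for both passes (e.g. via a logit-lens readout), scores each layer by the KL divergence between the contextualized and context-free distributions, flags a layer as "suppressing" when that divergence drops sharply relative to the previous layer, and rectifies by interpolating the suppressed hidden state toward a context-rich one. The function names, the KL-based score, the drop-ratio heuristic, and the interpolation coefficient `alpha` are all hypothetical choices for illustration.

```python
import numpy as np


def softmax(x):
    """Row-wise softmax over the last axis."""
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)


def layer_bias_scores(logits_ctx, logits_plain, eps=1e-12):
    """Per-layer KL(ctx || plain) between next-token distributions.

    logits_ctx, logits_plain: (num_layers, vocab) logits read out at each
    layer from the contextualized and context-free passes. A score near
    zero means that layer's prediction has collapsed onto the parametric
    (context-free) distribution, i.e. the retrieved evidence is ignored.
    """
    p = softmax(logits_ctx)
    q = softmax(logits_plain)
    return np.sum(p * np.log((p + eps) / (q + eps)), axis=-1)


def flag_suppressing_layers(scores, drop_ratio=0.5):
    """Flag layer l if the context signal shrinks sharply vs. layer l-1."""
    return [l for l in range(1, len(scores))
            if scores[l] < drop_ratio * scores[l - 1]]


def rectify(hidden, hidden_ctx_rich, alpha=0.5):
    """Pull a suppressed hidden state back toward a context-rich one."""
    return (1 - alpha) * hidden + alpha * hidden_ctx_rich


# Toy example: 5 layers, vocab of 5. The contextualized pass favors
# token 2 (the evidence-grounded answer) up to layer 2, then layers 3-4
# snap back to the parametric prediction (token 0).
ctx = np.array([[0, 0, 5, 0, 0]] * 3 + [[5, 0, 0, 0, 0]] * 2, dtype=float)
plain = np.array([[5, 0, 0, 0, 0]] * 5, dtype=float)

scores = layer_bias_scores(ctx, plain)
print(flag_suppressing_layers(scores))  # layer 3 is where suppression kicks in
```

In a real model the rectification step would be applied at the flagged layers during the forward pass (e.g. via forward hooks), so that subsequent layers propagate the evidence-grounded representation instead of the memorized prior.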