AI Summary
This paper addresses the issue of answer unfaithfulness to retrieved contexts in Retrieval-Augmented Generation (RAG) for Long-Form Question Answering (LFQA). Methodologically, it first identifies a strong correlation between retrieval attention heads and generation faithfulness; it then introduces a retrieval-head masking mechanism to synthesize unfaithful samples, and proposes control tokens to steer contrastive decoding between faithful and unfaithful answers. It also establishes GroundBench, the first LFQA benchmark explicitly designed to evaluate context faithfulness. Experiments demonstrate that RHIO significantly improves faithfulness (+12.7% on GroundBench), outperforming GPT-4o while preserving answer quality and factual consistency. Key contributions include: (1) a retrieval-head-induced optimization mechanism, (2) a control-token-driven contrastive decoding paradigm, and (3) the GroundBench evaluation benchmark.
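The control-token-driven contrastive decoding step can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the same model scores the next token twice, once conditioned on a faithful control token and once on an unfaithful one (token names and the weighting formula `(1 + alpha) * faithful - alpha * unfaithful` are common contrastive-decoding conventions, assumed here rather than taken from the paper).

```python
import numpy as np

def contrastive_logits(faithful_logits, unfaithful_logits, alpha=1.0):
    """Amplify the gap between faithful and unfaithful next-token scores.

    Both logit vectors are assumed to come from the same model, prompted
    with hypothetical [FAITHFUL] vs. [UNFAITHFUL] control tokens. Tokens
    favored mainly by the unfaithful branch are pushed down.
    """
    faithful_logits = np.asarray(faithful_logits, dtype=float)
    unfaithful_logits = np.asarray(unfaithful_logits, dtype=float)
    return (1 + alpha) * faithful_logits - alpha * unfaithful_logits

# Toy vocabulary of 3 tokens: token 0 is a hallucination both branches
# like, token 1 is supported by the retrieved context.
faithful = np.array([2.0, 1.8, 0.2])
unfaithful = np.array([2.1, 0.5, 0.3])

print(np.argmax(faithful))                         # greedy pick without contrast
print(np.argmax(contrastive_logits(faithful, unfaithful)))  # contrastive pick
```

With these toy scores, plain greedy decoding would pick the hallucinated token 0, while the contrastive score prefers the context-supported token 1, illustrating how the faithful/unfaithful gap is amplified.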
Abstract
Ensuring contextual faithfulness in retrieval-augmented large language models (LLMs) is crucial for building trustworthy information-seeking systems, particularly in long-form question-answering (LFQA) scenarios. In this work, we identify a salient correlation between LFQA faithfulness and retrieval heads, a set of attention heads responsible for retrieving contextual information. Leveraging this insight, we propose RHIO, a framework designed to teach LLMs to explicitly discriminate between faithful and unfaithful generations. RHIO first augments unfaithful samples that simulate realistic model-intrinsic errors by selectively masking retrieval heads. These samples are then incorporated into joint training, enabling the model to distinguish unfaithful outputs from faithful ones conditioned on control tokens. Furthermore, these control tokens are leveraged to self-induce contrastive outputs, amplifying their difference through contrastive decoding. Finally, to facilitate the evaluation of contextual faithfulness, we introduce GroundBench, a comprehensive benchmark compiled from five existing LFQA datasets. Extensive experimental results on GroundBench demonstrate that RHIO significantly improves faithfulness, even outperforming GPT-4o.