Gradient-guided Attention Map Editing: Towards Efficient Contextual Hallucination Mitigation

📅 2025-03-11

📈 Citations: 0

✨ Influential: 0

career value

198K/year

🤖 AI Summary

To address “context hallucination”—a critical issue in large language models (LLMs) where generated summaries or open-domain QA responses ignore input context—we propose a test-time dynamic attention editing method. Our approach introduces a gradient-guided attention reweighting mechanism, coupled with a lightweight hallucination-proneness classifier to identify high-risk attention heads; interventions are then applied along learnable, task-specific editing directions for precise, low-overhead correction. Crucially, the method requires no fine-tuning or additional training. Experiments demonstrate substantial improvements in context faithfulness: a 10% reduction in hallucination rate on XSum and marked gains in open-domain QA generalization. Moreover, our method achieves 7× speedup over state-of-the-art alternatives, offering a favorable trade-off among effectiveness, computational efficiency, and deployment practicality.

Technology Category

Application Category

📝 Abstract

In tasks like summarization and open-book question answering (QA), Large Language Models (LLMs) often encounter"contextual hallucination", where they produce irrelevant or incorrect responses despite having access to accurate source information. This typically occurs because these models tend to prioritize self-generated content over the input context, causing them to disregard pertinent details. To address this challenge, we introduce a novel method called"Guided Attention Map Editing"(GAME), which dynamically adjusts attention maps to improve contextual relevance. During inference, GAME employs a trained classifier to identify attention maps prone to inducing hallucinations and executes targeted interventions. These interventions, guided by gradient-informed"edit directions'', strategically redistribute attention weights across various heads to effectively reduce hallucination. Comprehensive evaluations on challenging summarization and open-book QA tasks show that GAME consistently reduces hallucinations across a variety of open-source models. Specifically, GAME reduces hallucinations by 10% in the XSum summarization task while achieving a 7X speed-up in computational efficiency compared to the state-of-the-art baselines.

Problem

Research questions and friction points this paper is trying to address.

Mitigates contextual hallucination in LLMs

Improves attention map relevance dynamically

Reduces hallucinations and enhances computational efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic attention map adjustment for contextual relevance

Gradient-guided interventions to reduce hallucinations

Classifier identifies and edits problematic attention maps

🔎 Similar Papers

Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback