Gradient-guided Attention Map Editing: Towards Efficient Contextual Hallucination Mitigation

📅 2025-03-11
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address “context hallucination”—a critical issue in large language models (LLMs) where generated summaries or open-domain QA responses ignore input context—we propose a test-time dynamic attention editing method. Our approach introduces a gradient-guided attention reweighting mechanism, coupled with a lightweight hallucination-proneness classifier to identify high-risk attention heads; interventions are then applied along learnable, task-specific editing directions for precise, low-overhead correction. Crucially, the method requires no fine-tuning or additional training. Experiments demonstrate substantial improvements in context faithfulness: a 10% reduction in hallucination rate on XSum and marked gains in open-domain QA generalization. Moreover, our method achieves 7× speedup over state-of-the-art alternatives, offering a favorable trade-off among effectiveness, computational efficiency, and deployment practicality.

Technology Category

Application Category

📝 Abstract
In tasks like summarization and open-book question answering (QA), Large Language Models (LLMs) often encounter"contextual hallucination", where they produce irrelevant or incorrect responses despite having access to accurate source information. This typically occurs because these models tend to prioritize self-generated content over the input context, causing them to disregard pertinent details. To address this challenge, we introduce a novel method called"Guided Attention Map Editing"(GAME), which dynamically adjusts attention maps to improve contextual relevance. During inference, GAME employs a trained classifier to identify attention maps prone to inducing hallucinations and executes targeted interventions. These interventions, guided by gradient-informed"edit directions'', strategically redistribute attention weights across various heads to effectively reduce hallucination. Comprehensive evaluations on challenging summarization and open-book QA tasks show that GAME consistently reduces hallucinations across a variety of open-source models. Specifically, GAME reduces hallucinations by 10% in the XSum summarization task while achieving a 7X speed-up in computational efficiency compared to the state-of-the-art baselines.
Problem

Research questions and friction points this paper is trying to address.

Mitigates contextual hallucination in LLMs
Improves attention map relevance dynamically
Reduces hallucinations and enhances computational efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic attention map adjustment for contextual relevance
Gradient-guided interventions to reduce hallucinations
Classifier identifies and edits problematic attention maps
Y
Yu Wang
University of California, Santa Barbara, Santa Barbara, CA
J
Jiaxin Zhang
Intuit AI Research, Mountain View, CA
X
Xiang Gao
Intuit AI Research, Mountain View, CA
Wendi Cui
Wendi Cui
Intuit, Carnegie Mellon University
LLMMachine LearningSearch
P
Peng Li
University of California, Santa Barbara, Santa Barbara, CA
K
Kamalika Das
Intuit AI Research, Mountain View, CA