🤖 AI Summary
Large language models (LLMs) frequently hallucinate, generating factually incorrect or contextually inconsistent outputs. To address this, we propose HICD, a method that first selects the attention heads most important to the model's prediction and disperses their attention, thereby inducing hallucinated outputs in a controlled way; it then performs contrastive decoding between the induced hallucination and the original generation to improve factual fidelity. HICD operates at the level of individual attention heads, which makes the induction interpretable and controllable, and it is described as requiring no external knowledge bases or model fine-tuning. Evaluated on context completion, reading comprehension, and question answering tasks, HICD significantly improves contextual faithfulness, improves factuality on tasks requiring accurate knowledge recall, and outperforms other hallucination-inducing methods for contrastive decoding.
📝 Abstract
Large Language Models (LLMs) often generate hallucinations, producing outputs that are contextually inaccurate or factually incorrect. We introduce HICD, a novel method that mitigates hallucinations by deliberately inducing them and using the result for contrastive decoding. Unlike existing contrastive decoding methods, HICD selects attention heads crucial to the model's prediction as inducing heads, induces hallucinations by dispersing the attention of these heads, and then compares the hallucinated outputs with the original outputs to obtain the final result. Our approach significantly improves performance on tasks requiring contextual faithfulness, such as context completion, reading comprehension, and question answering. It also improves factuality in tasks requiring accurate knowledge recall. We demonstrate that our inducing-head selection and attention dispersion method leads to more "contrast-effective" hallucinations for contrastive decoding, outperforming other hallucination-inducing methods. Our findings provide a promising strategy for reducing hallucinations by inducing them in a controlled manner, enhancing the performance of LLMs across a wide range of tasks.
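
The three steps named in the abstract (select inducing heads, disperse their attention, contrast the two outputs) can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not give formulas, so the top-k head selection, the uniform form of "dispersion", the contrast rule `(1 + alpha) * log p_orig - alpha * log p_induced`, the plausibility cutoff `beta`, and all function names are assumptions borrowed from common contrastive decoding practice.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_inducing_heads(importance, k):
    """Return the indices of the k heads with the highest importance score.

    Assumption: `importance` holds one hypothetical score per head; how the
    scores are computed is not specified in the abstract.
    """
    ranked = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    return ranked[:k]


def disperse_attention(weights):
    """Spread one attention row uniformly over the positions it could attend to.

    Assumption: "dispersing" is modeled as replacing the row with a uniform
    distribution over positions that had nonzero weight (masked ones stay 0).
    """
    allowed = [w > 0.0 for w in weights]
    n = sum(allowed)
    return [1.0 / n if a else 0.0 for a in allowed]


def contrastive_scores(orig_logits, induced_logits, alpha=1.0, beta=0.1):
    """Score tokens by contrasting original and hallucination-induced outputs.

    Tokens whose original probability falls below beta * max(p_orig) are
    masked out so the contrast cannot promote implausible tokens.
    """
    p_orig = softmax(orig_logits)
    p_ind = softmax(induced_logits)
    cutoff = beta * max(p_orig)
    scores = []
    for po, pi in zip(p_orig, p_ind):
        if po < cutoff:
            scores.append(float("-inf"))
        else:
            scores.append((1 + alpha) * math.log(po) - alpha * math.log(pi))
    return scores


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


# Toy example: token 1 depends on context, so its logit drops sharply once
# the inducing heads are dispersed; token 0 is unaffected by the dispersion.
orig_logits = [2.0, 1.9, -3.0]
induced_logits = [2.0, 0.5, -3.0]
scores = contrastive_scores(orig_logits, induced_logits)
greedy_choice = argmax(orig_logits)
contrastive_choice = argmax(scores)
```

In this toy case greedy decoding on the original logits picks token 0, while the contrast favors token 1, the token whose probability collapses under the induced hallucination. In a real model the induced logits would come from a second forward pass with the selected heads' attention rows dispersed.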