🤖 AI Summary
Standard entropy-based uncertainty quantification (UQ) methods fail in retrieval-augmented generation (RAG) because activation of the model's induction heads can cause correct answers to be misclassified as highly uncertain. This work uncovers, for the first time, a "tug-of-war" effect between induction heads and entropy neurons in RAG systems and proposes a mechanism-driven, induction-aware entropy gating method that calibrates predictive entropy using interpretable internal contextual signals. Evaluated across four RAG benchmarks and six open-source large language models (ranging from 4B to 13B parameters), the proposed approach consistently matches or outperforms existing UQ techniques, substantially improving hallucination detection performance.
📄 Abstract
While retrieval-augmented generation (RAG) significantly improves the factual reliability of LLMs, it does not eliminate hallucinations, so robust uncertainty quantification (UQ) remains essential. In this paper, we reveal that standard entropy-based UQ methods often fail in RAG settings due to a mechanistic paradox. Context utilization gives rise to an internal "tug-of-war": while induction heads promote grounded responses by copying the correct answer from the retrieved context, they collaterally trigger the previously documented "entropy neurons". This interaction inflates predictive entropy, causing the model to signal false uncertainty on accurate outputs. To address this, we propose INTRYGUE (Induction-Aware Entropy Gating for Uncertainty Estimation), a mechanistically grounded method that gates predictive entropy based on the activation patterns of induction heads. Evaluated across four RAG benchmarks and six open-source LLMs (4B to 13B parameters), INTRYGUE consistently matches or outperforms a wide range of UQ baselines. Our findings demonstrate that hallucination detection in RAG benefits from combining predictive uncertainty with interpretable, internal signals of context utilization.
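The core idea can be illustrated with a minimal sketch. The abstract does not specify INTRYGUE's exact gating function, so the names (`gated_entropy`, `alpha`) and the sigmoidal-style gate below are hypothetical: the sketch simply shows how a scalar induction-head activation score could downweight raw predictive entropy, so that high entropy co-occurring with strong context copying is no longer read as genuine uncertainty.

```python
import numpy as np

def predictive_entropy(probs):
    # Shannon entropy of a next-token distribution (nats).
    probs = np.asarray(probs, dtype=float)
    return float(-np.sum(probs * np.log(probs + 1e-12)))

def gated_entropy(probs, induction_scores, alpha=1.0):
    # Hypothetical gate: when induction heads fire strongly (the model is
    # copying the answer from retrieved context), shrink the entropy signal,
    # since the "tug-of-war" with entropy neurons inflates it artificially.
    # `induction_scores` stands in for per-head activation strengths;
    # `alpha` controls how aggressively the gate closes.
    h = predictive_entropy(probs)
    gate = 1.0 / (1.0 + alpha * float(np.mean(induction_scores)))
    return h * gate
```

With no induction activity the gate is 1 and the raw entropy passes through unchanged; as the mean activation grows, the uncertainty score shrinks, which is the qualitative behavior the paper's gating mechanism targets.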