INTRYGUE: Induction-Aware Entropy Gating for Reliable RAG Uncertainty Estimation

πŸ“… 2026-03-23
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
Standard entropy-based uncertainty quantification (UQ) methods fail in retrieval-augmented generation (RAG) because activation of the model’s induction heads can cause correct answers to be misclassified as highly uncertain. This work uncovers, for the first time, a β€œtug-of-war” effect between induction heads and entropy neurons in RAG systems and proposes a mechanism-driven, induction-aware entropy gating method that calibrates predictive entropy using interpretable internal contextual signals. Evaluated across four RAG benchmarks and six open-source large language models (ranging from 4B to 13B parameters), the proposed approach consistently matches or outperforms existing UQ techniques, substantially improving hallucination detection performance.

πŸ“ Abstract
While retrieval-augmented generation (RAG) significantly improves the factual reliability of LLMs, it does not eliminate hallucinations, so robust uncertainty quantification (UQ) remains essential. In this paper, we reveal that standard entropy-based UQ methods often fail in RAG settings due to a mechanistic paradox. An internal "tug-of-war" inherent to context utilization appears: while induction heads promote grounded responses by copying the correct answer, they collaterally trigger the previously established "entropy neurons". This interaction inflates predictive entropy, causing the model to signal false uncertainty on accurate outputs. To address this, we propose INTRYGUE (Induction-Aware Entropy Gating for Uncertainty Estimation), a mechanistically grounded method that gates predictive entropy based on the activation patterns of induction heads. Evaluated across four RAG benchmarks and six open-source LLMs (4B to 13B parameters), INTRYGUE consistently matches or outperforms a wide range of UQ baselines. Our findings demonstrate that hallucination detection in RAG benefits from combining predictive uncertainty with interpretable, internal signals of context utilization.
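The core idea in the abstract, gating predictive entropy by how strongly induction heads fire, can be sketched in a few lines. The sketch below is an illustrative assumption of what such a gate might look like: `induction_score` stands in for a measured induction-head activation, and the reciprocal gate shape and `alpha` parameter are hypothetical, not the paper's exact formulation.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def gated_entropy(probs, induction_score, alpha=1.0):
    """Induction-aware entropy gating (illustrative sketch).

    Shrinks predictive entropy when induction-head activity suggests the
    model is copying a grounded answer from the retrieved context, so a
    correct copied answer is no longer flagged as highly uncertain.
    `induction_score` (>= 0) and the 1/(1 + alpha*score) gate are
    assumptions for illustration, not the method's published form.
    """
    gate = 1.0 / (1.0 + alpha * induction_score)
    return gate * predictive_entropy(probs)
```

With a uniform distribution over four tokens, raw entropy is `log 4`; a strong induction signal (e.g. `induction_score=3.0`) shrinks the gated value toward zero, while a zero signal leaves the entropy unchanged.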
Problem

Research questions and friction points this paper is trying to address.

retrieval-augmented generation
uncertainty quantification
hallucination detection
entropy-based UQ
induction heads
Innovation

Methods, ideas, or system contributions that make the work stand out.

Retrieval-Augmented Generation
Uncertainty Quantification
Induction Heads
Entropy Gating
Hallucination Detection
πŸ”Ž Similar Papers
No similar papers found.
Alexandra Bazarova
Applied AI Institute
Andrei Volodichev
Applied AI Institute
Daria Kotova
Master's student
Alexey Zaytsev
Associate professor at BIMSA
Deep learning · Machine learning · Statistics