Between the Layers Lies the Truth: Uncertainty Estimation in LLMs Using Intra-Layer Local Information Scores

📅 2026-03-17
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Large language models often exhibit overconfidence even when incorrect, highlighting the urgent need for efficient and transferable uncertainty estimation methods. This work proposes a lightweight, training-free approach that derives instance-level uncertainty scores by analyzing cross-layer local consistency among internal representations through a single forward pass. The method is computationally efficient, model-agnostic, and readily applicable across diverse architectures and settings. Experimental results demonstrate that it matches or surpasses existing detection techniques on in-distribution data and significantly outperforms baselines under challenging conditions—such as out-of-distribution generalization and 4-bit quantization—with improvements of up to 2.86 points in AUPRC and 21.02 points in Brier score. Additionally, the analysis reveals notable differences across models in how uncertainty is encoded across layers.

📝 Abstract
Large language models (LLMs) are often confidently wrong, making reliable uncertainty estimation (UE) essential. Output-based heuristics are cheap but brittle, while probing internal representations is effective yet high-dimensional and hard to transfer. We propose a compact, per-instance UE method that scores cross-layer agreement patterns in internal representations using a single forward pass. Across three models, our method matches probing in-distribution, with mean diagonal differences of at most $-1.8$ AUPRC percentage points and $+4.9$ Brier score points. Under cross-dataset transfer, it consistently outperforms probing, achieving off-diagonal gains up to $+2.86$ AUPRC and $+21.02$ Brier points. Under 4-bit weight-only quantization, it remains robust, improving over probing by $+1.94$ AUPRC points and $+5.33$ Brier points on average. Beyond performance, examining specific layer--layer interactions reveals differences in how disparate models encode uncertainty. Altogether, our UE method offers a lightweight, compact means to capture transferable uncertainty in LLMs.
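The abstract describes scoring cross-layer agreement among internal representations from a single forward pass. The paper's exact scoring function is not given here, but the core idea can be sketched as follows: take a per-layer representation (e.g. the final-token hidden state at each layer), measure pairwise agreement between layers, and map low agreement to high uncertainty. The function name, the use of cosine similarity, and the choice of averaging over layer pairs are all illustrative assumptions, not the paper's method.

```python
import numpy as np

def cross_layer_agreement_score(hidden_states: np.ndarray) -> float:
    """Toy uncertainty score from cross-layer agreement (illustrative only).

    hidden_states: array of shape (num_layers, hidden_dim), e.g. the
    final-token representation at each layer from one forward pass.
    Returns a scalar in [0, 1]: lower agreement -> higher uncertainty.
    """
    # Normalize each layer's representation to unit length.
    norms = np.linalg.norm(hidden_states, axis=1, keepdims=True)
    unit = hidden_states / np.clip(norms, 1e-12, None)

    # Pairwise cosine similarities between all layers.
    sims = unit @ unit.T

    # Mean similarity over distinct layer pairs (upper triangle, k=1).
    iu = np.triu_indices(hidden_states.shape[0], k=1)
    mean_agreement = sims[iu].mean()

    # Map agreement in [-1, 1] to an uncertainty score in [0, 1].
    return float((1.0 - mean_agreement) / 2.0)
```

Such a score is cheap (one forward pass, a handful of dot products) and produces a single scalar per instance, which is consistent with the abstract's emphasis on a compact, transferable alternative to high-dimensional probing.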
Problem

Research questions and friction points this paper is trying to address.

uncertainty estimation
large language models
internal representations
transferability
model confidence
Innovation

Methods, ideas, or system contributions that make the work stand out.

uncertainty estimation
large language models
intra-layer information
cross-layer agreement
model robustness