Trust in One Round: Confidence Estimation for Large Language Models via Structural Signals

📅 2026-02-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing confidence estimation methods for large language models exhibit fragility under distributional shifts, domain-specific text, and resource-constrained settings. This work proposes Structural Confidence, a novel framework that leverages multi-scale structural signals—such as spectral properties, local variability, and global geometry—from the trajectory of the model’s final-layer hidden states to construct a model-agnostic posterior confidence estimator. Notably, this approach requires only a single forward pass and dispenses with repeated sampling or auxiliary models. Evaluated across four heterogeneous benchmarks—FEVER, SciFact, WikiBio-hallucination, and TruthfulQA—the method consistently outperforms current baselines in both AUROC and AUPR metrics, demonstrating superior reliability and computational efficiency.
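The AUROC criterion mentioned above can be made concrete: treating the estimator's scores as a ranking of correct versus incorrect outputs, AUROC equals the probability that a correct answer receives a higher confidence than an incorrect one. A minimal, dependency-free sketch (function name and toy data are illustrative, not from the paper):

```python
def auroc(scores, labels):
    """Area under the ROC curve via the pairwise-ranking identity:
    the probability that a positive (correct) example is scored above
    a negative (incorrect) one, counting ties as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy check: confidences that perfectly separate correct (1) from
# incorrect (0) outputs yield AUROC = 1.0.
print(auroc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]))  # → 1.0
```

AUPR is computed analogously from the precision-recall curve and is more informative when correct and incorrect outputs are heavily imbalanced.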

📝 Abstract
Large language models (LLMs) are increasingly deployed in domains where errors carry high social, scientific, or safety costs. Yet standard confidence estimators, such as token likelihood, semantic similarity, and multi-sample consistency, remain brittle under distribution shift, domain-specialised text, and compute limits. In this work, we present Structural Confidence, a single-pass, model-agnostic framework that predicts output correctness from multi-scale structural signals derived from a model's final-layer hidden-state trajectory. By combining spectral, local-variation, and global-shape descriptors, our method captures internal stability patterns that token probabilities and sentence embeddings miss. We conduct an extensive cross-domain evaluation across four heterogeneous benchmarks: FEVER (fact verification), SciFact (scientific claims), WikiBio-hallucination (biographical consistency), and TruthfulQA (truthfulness-oriented QA). Structural Confidence demonstrates strong performance against established baselines in terms of AUROC and AUPR. More importantly, unlike sampling-based consistency methods, which require multiple stochastic generations and an auxiliary model, our approach uses a single deterministic forward pass, offering a practical basis for efficient, robust post-hoc confidence estimation in socially impactful, resource-constrained LLM applications.
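As a rough illustration of the kind of trajectory descriptors the abstract names, the sketch below computes one spectral, one local-variation, and one global-shape signal from a token-by-token matrix of final-layer hidden states. The function name, the specific descriptors, and the toy data are assumptions for illustration; the paper's exact features may differ, and in practice a lightweight classifier would map such features to a confidence score.

```python
import math
import random

def structural_features(H):
    """Illustrative multi-scale descriptors of a hidden-state trajectory.

    H: list of T hidden-state vectors (length-d lists of floats), one per
    token of the generated output. These descriptors are a sketch, not the
    paper's definitions.
    """
    T, d = len(H), len(H[0])
    mean = [sum(h[j] for h in H) / T for j in range(d)]
    Hc = [[h[j] - mean[j] for j in range(d)] for h in H]  # centered trajectory

    # Spectral signal: fraction of total variance captured by the top
    # principal direction (power iteration on the covariance), a simple
    # concentration measure of the trajectory's spectrum.
    trace = sum(x * x for row in Hc for x in row) / T
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(50):
        proj = [sum(row[j] * v[j] for j in range(d)) for row in Hc]
        w = [sum(proj[i] * Hc[i][j] for i in range(T)) / T for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w)) or 1.0
        v = [x / norm for x in w]
    proj = [sum(row[j] * v[j] for j in range(d)) for row in Hc]
    top_eig = sum(p * p for p in proj) / T
    spectral_concentration = top_eig / (trace + 1e-12)

    # Local variability: mean distance between consecutive hidden states.
    steps = [math.dist(H[i], H[i + 1]) for i in range(T - 1)]
    local_variability = sum(steps) / len(steps)

    # Global geometry: net displacement over total path length
    # ("straightness" of the trajectory, in (0, 1]).
    straightness = math.dist(H[0], H[-1]) / (sum(steps) + 1e-12)

    return [spectral_concentration, local_variability, straightness]

# Toy trajectory: 20 tokens with 16-dimensional hidden states.
random.seed(0)
H = [[random.gauss(0.0, 1.0) for _ in range(16)] for _ in range(20)]
feats = structural_features(H)
```

All three signals come from a single deterministic forward pass, which is what distinguishes this style of estimator from multi-sample consistency methods.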
Problem

Research questions and friction points this paper is trying to address.

confidence estimation
large language models
distribution shift
model reliability
post-hoc evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Structural Confidence
single-pass confidence estimation
hidden-state trajectory
model-agnostic
multi-scale structural signals