🤖 AI Summary
Current uncertainty quantification (UQ) methods for large language models equate the internal consistency of generated outputs with external correctness. As a result, they fail to detect “confident hallucinations” and create a false sense of reliability in deployment. This work reveals that existing UQ approaches are fundamentally unsupervised clustering procedures that suffer from three critical pathologies: sensitivity to hyperparameters, conflation of output stability with factual truthfulness, and an absence of ground truth that forces evaluation to rely on unstable proxy metrics. Through theoretical analysis and empirical evaluation, we argue that effective UQ must move beyond internal model states and anchor instead to objective truth. We propose a truth-oriented UQ paradigm, outline concrete mechanisms for improvement, and advocate for a restructured evaluation framework, laying both a theoretical foundation and a research roadmap for developing reliable uncertainty quantification in large language models.
📝 Abstract
Uncertainty Quantification (UQ) is widely regarded as the primary safeguard for deploying Large Language Models (LLMs) in high-stakes domains. However, we argue that the field suffers from a category error: mainstream UQ methods for LLMs are just unsupervised clustering algorithms. We demonstrate that most current approaches inherently quantify the internal consistency of the model's generations rather than their external correctness. Consequently, current methods are fundamentally blind to factual reality and fail to detect “confident hallucinations,” where models exhibit high confidence in stable but incorrect answers. As a result, deploying models under current UQ methods may create a deceptive sense of safety. Specifically, we identify three critical pathologies resulting from this dependence on internal state: a hyperparameter sensitivity crisis that renders deployment unsafe, an internal evaluation cycle that conflates stability with truth, and a fundamental lack of ground truth that forces reliance on unstable proxy metrics to evaluate uncertainty. To resolve this impasse, we advocate for a paradigm shift in UQ and outline a roadmap for the research community to adopt better evaluation metrics and settings, implement mechanism changes for native uncertainty, and anchor verification in objective truth, so that model confidence serves as a reliable proxy for reality.
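The gap between internal consistency and external correctness can be illustrated with a minimal sketch. It is not the method of this paper or of any specific UQ library. The function names, the sampled answers, and the question are all hypothetical. Clustering is reduced to case-insensitive exact matching as a stand-in for the semantic clustering that real methods use. The sketch computes an entropy over answer clusters and shows that a stable but wrong answer receives the lowest possible uncertainty score.

```python
import math
from collections import Counter


def cluster_counts(samples):
    """Group sampled answers into clusters.

    Assumption: a toy clustering by case-insensitive exact match,
    standing in for semantic-equivalence clustering.
    """
    return Counter(s.strip().lower() for s in samples)


def consistency_entropy(samples):
    """Shannon entropy (in nats) of the cluster distribution.

    This score depends only on how much the samples agree with each
    other. It never consults a ground-truth label.
    """
    counts = cluster_counts(samples)
    total = sum(counts.values())
    return sum((n / total) * math.log(total / n) for n in counts.values())


def majority_answer(samples):
    """Return the label of the largest cluster."""
    return cluster_counts(samples).most_common(1)[0][0]


# Hypothetical samples for "What is the capital of Australia?"
ground_truth = "canberra"
stable_but_wrong = ["Sydney", "sydney", "Sydney ", "Sydney", "Sydney"]
unstable = ["Canberra", "Sydney", "Melbourne", "Canberra", "Perth"]

h_wrong = consistency_entropy(stable_but_wrong)  # single cluster
h_mixed = consistency_entropy(unstable)          # four clusters

# The consistently wrong answer gets a lower uncertainty score than the
# unstable set whose majority answer happens to be correct.
print(f"stable-but-wrong entropy: {h_wrong:.3f}, "
      f"correct: {majority_answer(stable_but_wrong) == ground_truth}")
print(f"unstable entropy:         {h_mixed:.3f}, "
      f"correct: {majority_answer(unstable) == ground_truth}")
```

In this toy setting the entropy score ranks the confident hallucination as the most trustworthy output, because the score is a function of agreement among samples alone. Anchoring the evaluation to `ground_truth`, as the abstract advocates, is what exposes the failure.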