🤖 AI Summary
This paper addresses the pervasive problem of epistemic miscalibration in large language models (LLMs): a systematic mismatch between asserted confidence and actual accuracy. We propose the first assertion-strength quantification framework explicitly designed for calibration diagnosis, supported by a new human-annotated calibration dataset and a measurement methodology grounded in semantic intensity and confidence mapping. Through cross-dataset validation, human-AI collaborative evaluation, and rigorous statistical significance testing, we systematically uncover widespread overconfidence across mainstream LLMs. Experiments demonstrate that our framework reduces assertion-misclassification rates by over 50% and significantly improves trustworthiness diagnostics across multiple benchmarks. This work establishes a quantifiable, diagnosable, and correctable paradigm for LLM calibration, laying a methodological foundation for trustworthy AI.
📝 Abstract
Large language models (LLMs) are increasingly relied upon as sources of information, but their propensity for generating false or misleading statements with high confidence poses risks for users and society. In this paper, we confront the critical problem of epistemic miscalibration, where a model's linguistic assertiveness fails to reflect its true internal certainty. We introduce a new human-labeled dataset and a novel method for measuring the linguistic assertiveness of LLMs that cuts error rates by over 50% relative to previous benchmarks. Validated across multiple datasets, our method reveals a stark misalignment between how confidently models present information and their actual accuracy. Further human evaluations confirm the severity of this miscalibration. This evidence underscores the urgent risk posed by LLMs' overstated certainty, which may mislead users on a massive scale. Our framework provides a crucial step forward in diagnosing this miscalibration, offering a path toward correcting it and toward more trustworthy AI across domains.
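To make the diagnostic idea concrete, here is a minimal sketch of how one might quantify the gap between linguistic assertiveness and accuracy, in the spirit of expected calibration error. This is an illustration under assumptions, not the paper's actual method: the function name `assertiveness_gap`, the equal-width binning, and the premise that assertiveness has already been mapped to a score in [0, 1] are all hypothetical choices.

```python
import numpy as np

def assertiveness_gap(assertiveness, correct, n_bins=10):
    """Hypothetical ECE-style diagnostic (illustration only).

    Bins statements by their assertiveness score in [0, 1] and compares
    each bin's mean asserted confidence against its empirical accuracy.
    A large gap in a high-assertiveness bin signals overconfidence:
    the model sounds more certain than it is correct.
    """
    assertiveness = np.asarray(assertiveness, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Assign each statement to a bin (interior edges -> indices 0..n_bins-1).
    idx = np.digitize(assertiveness, edges[1:-1])
    gap = 0.0
    for b in range(n_bins):
        mask = idx == b
        if not mask.any():
            continue
        # Weight each bin's |mean assertiveness - accuracy| by its share of data.
        gap += mask.mean() * abs(assertiveness[mask].mean() - correct[mask].mean())
    return gap

# Toy example: highly assertive statements that are often wrong yield a large gap.
scores = [0.95, 0.90, 0.92, 0.30, 0.20]  # assertiveness per statement
labels = [0, 1, 0, 1, 0]                 # 1 = factually correct
print(f"miscalibration gap: {assertiveness_gap(scores, labels):.3f}")
```

A gap near zero would indicate that how strongly the model asserts a claim tracks how often the claim is true; the stark misalignment reported above corresponds to a persistently large gap concentrated in the most assertive bins.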