🤖 AI Summary
Large language models (LLMs) exhibit an apparent paradox—overconfidence in initial judgments followed by rapid self-doubt upon challenge—yet the underlying cognitive mechanism remains unexplained.
Method: We introduce a novel experimental paradigm that decouples answer selection from confidence calibration, revealing a “choice-supportive bias”: LLMs reinforce their selected answers, systematically undervalue counterevidence, and overreact to inconsistent feedback. Using controlled prompting and multi-turn feedback analysis, we evaluate this bias across Gemma 3, GPT-4o, and o1-preview.
Contribution/Results: We demonstrate its cross-model and cross-domain robustness; empirical decision updates significantly deviate from Bayesian norms and are not attributable to memory interference. This work provides the first systematic deconstruction of the “stubborn-yet-fickle” paradox in LLMs, offering both theoretical insight into their metacognitive limitations and a methodological framework for improving calibration mechanisms.
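To make the paradigm concrete, here is a minimal sketch of how such a two-stage confidence elicitation could be run. The prompt wording, the advisor framing, and the use of the OpenAI client are illustrative assumptions, not the authors' released code.

```python
# Sketch of the two-stage paradigm (illustrative assumptions, see above).
import openai

client = openai.OpenAI()

def ask(messages: list[dict]) -> str:
    """One stateless completion; the model sees only `messages`."""
    resp = client.chat.completions.create(model="gpt-4o", messages=messages)
    return resp.choices[0].message.content

question = ("Which city lies further north: (A) Paris or (B) Vancouver?\n"
            "Give your answer and a confidence from 0 to 100.")

# Stage 1: elicit an initial answer plus confidence.
initial = ask([{"role": "user", "content": question}])

advice = "An advisor with 70% historical accuracy says the answer is B."
followup = advice + "\nGive your final answer and a confidence from 0 to 100."

# Stage 2a ("shown"): the model's initial commitment stays in context.
shown = ask([
    {"role": "user", "content": question},
    {"role": "assistant", "content": initial},
    {"role": "user", "content": followup},
])

# Stage 2b ("hidden"): same question and advice, but the initial answer is
# omitted, so the stateless model has no memory of its prior commitment,
# a manipulation impossible with human participants.
hidden = ask([{"role": "user", "content": question + "\n" + followup}])

# Choice-supportive bias predicts higher confidence and fewer answer changes
# in the "shown" condition than in the "hidden" condition.
print(shown, hidden, sep="\n---\n")
```

Comparing the "shown" and "hidden" conditions isolates the effect of a visible prior commitment from the informational content of the advice itself.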
📝 Abstract
Large language models (LLMs) exhibit strikingly conflicting behaviors: they can appear steadfastly overconfident in their initial answers while simultaneously being prone to excessive doubt when challenged. To investigate this apparent paradox, we developed a novel experimental paradigm that exploits the unique ability to obtain confidence estimates from LLMs without creating memory of their initial judgments, something impossible with human participants. We show that LLMs (Gemma 3, GPT-4o, and o1-preview) exhibit a pronounced choice-supportive bias that inflates their confidence in their chosen answer, resulting in a marked resistance to changing their minds. We further demonstrate that LLMs markedly overweight inconsistent relative to consistent advice, in a fashion that deviates qualitatively from normative Bayesian updating. Finally, we demonstrate that these two mechanisms, a drive to maintain consistency with prior commitments and hypersensitivity to contradictory feedback, parsimoniously capture LLM behavior in a different domain. Together, these findings furnish a mechanistic account of LLM confidence that explains both their stubbornness and their excessive sensitivity to criticism.
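As a concrete reference for the Bayesian norm invoked above, here is a minimal log-odds formulation in our own notation (not the paper's): with current confidence c in the chosen answer and an advisor of accuracy q, Bayes' rule prescribes updates of equal magnitude whether the advice agrees or disagrees.

```latex
% Normative benchmark (illustrative notation): c = prior confidence in the
% chosen answer, c' = posterior confidence, q = advisor accuracy.
\[
  \log\frac{c'}{1-c'} \;=\; \log\frac{c}{1-c} \;\pm\; \log\frac{q}{1-q}
\]
% "+" when the advice agrees with the current answer, "-" when it disagrees.
% A Bayesian agent shifts its log-odds confidence by the same amount in both
% cases; the reported hypersensitivity to contradictory feedback means the
% disagreeing shift is empirically much larger than the agreeing one.
```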