An Empirical Study of the Anchoring Effect in LLMs: Existence, Mechanism, and Potential Mitigations

📅 2025-05-21
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This paper presents the first systematic empirical validation of the cognitive anchoring bias in large language models (LLMs)β€”a systematic deviation in model judgments induced by initial information (β€œanchors”). To investigate this phenomenon, the authors construct SynAnchors, a synthetic benchmark dataset, and introduce a hierarchical attribution analysis framework coupled with an enhanced bias quantification metric. Through multi-model benchmarking and targeted reasoning-path interventions, they demonstrate that anchoring bias is pervasive across LLMs and predominantly manifests in shallow transformer layers; conventional robustness techniques prove ineffective, whereas chain-of-thought reasoning partially mitigates the effect. Key contributions include: (1) establishing the first empirical foundation for anchoring bias in LLMs; (2) proposing a cognitive-bias-aware evaluation paradigm for trustworthy AI; and (3) open-sourcing SynAnchors and its evaluation framework to support reproducible, bias-aware AI research.

πŸ“ Abstract
The rise of Large Language Models (LLMs) like ChatGPT has advanced natural language processing, yet concerns about cognitive biases are growing. In this paper, we investigate the anchoring effect, a cognitive bias in which the mind relies heavily on the first piece of information (the anchor) when making subsequent judgments. We explore whether LLMs are affected by anchoring, the underlying mechanisms, and potential mitigation strategies. To facilitate studies of the anchoring effect at scale, we introduce a new dataset, SynAnchors. Combined with refined evaluation metrics, we benchmark currently widely used LLMs. Our findings show that anchoring bias in LLMs is pervasive, acts in shallow layers, and is not eliminated by conventional strategies, while reasoning can offer some mitigation. This recontextualization via cognitive psychology urges that LLM evaluations focus not on standard benchmarks or over-optimized robustness tests, but on cognitive-bias-aware trustworthy evaluation.
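The abstract's notion of an anchoring-induced shift can be made concrete with the classical anchoring index from cognitive psychology: the fraction of the distance from an unanchored baseline judgment toward the anchor that the anchored judgment moves. This is a minimal sketch with invented numbers, not the paper's refined bias quantification metric:

```python
def anchoring_index(baseline: float, anchored: float, anchor: float) -> float:
    """Classical anchoring index (Jacowitz & Kahneman style):
    fraction of the distance from the unanchored baseline toward the
    anchor that the anchored answer has moved.
    0.0 = no shift toward the anchor; 1.0 = answer equals the anchor."""
    if anchor == baseline:
        raise ValueError("anchor must differ from the unanchored baseline")
    return (anchored - baseline) / (anchor - baseline)

# Hypothetical example: a model estimates a quantity as 50 with no anchor,
# but answers 65 after the prompt mentions an anchor of 100.
print(anchoring_index(baseline=50, anchored=65, anchor=100))  # -> 0.3
```

Values between 0 and 1 indicate a partial pull toward the anchor; averaging the index over many question/anchor pairs gives one simple aggregate measure of a model's susceptibility.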
Problem

Research questions and friction points this paper is trying to address.

Investigates anchoring effect existence in LLMs
Explores mechanisms and mitigation for anchoring bias
Proposes cognitive-bias-aware trustworthy LLM evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces SynAnchors dataset for anchoring effect studies
Uses refined metrics to benchmark LLMs' anchoring bias
Proposes reasoning as mitigation for anchoring bias
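A synthetic anchoring benchmark of this kind rests on paired prompts: a neutral control question and a variant that plants an arbitrary initial value. The helper name and prompt wording below are hypothetical, illustrating the idea rather than SynAnchors' actual format:

```python
def make_prompt_pair(question: str, anchor_value: float) -> tuple[str, str]:
    """Build a (control, anchored) prompt pair for probing anchoring bias.

    The control prompt asks the question directly; the anchored prompt
    first exposes the model to an arbitrary candidate value."""
    control = f"{question} Answer with a single number."
    anchored = (
        f"A colleague first guessed {anchor_value}. "
        f"{question} Answer with a single number."
    )
    return control, anchored

# Hypothetical usage: query a model with both prompts and compare answers.
control, anchored = make_prompt_pair(
    "How many countries are members of the United Nations?", 50
)
print(control)
print(anchored)
```

Querying a model with both prompts and comparing the numeric answers (e.g. via an anchoring index) yields one data point; varying the anchor across a range of implausible values separates genuine knowledge from anchor-following.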
Authors
Yiming Huang, The Hong Kong University of Science and Technology, Guangzhou
Biquan Bie, Independent Researcher
Zuqiu Na, The Hong Kong University of Science and Technology, Guangzhou
Weilin Ruan, Hong Kong University of Science and Technology (Guangzhou). Interests: Spatio-Temporal Data Mining
Songxin Lei, MPhil Student, The Hong Kong University of Science and Technology. Interests: Spatio-Temporal Data Mining, Edge Computing, Reinforcement Learning
Yutao Yue, The Hong Kong University of Science and Technology, Guangzhou
Xinlei He, Assistant Professor, HKUST(GZ). Interests: Trustworthy Machine Learning, Security, Privacy