Chain-of-Thought Prompting Obscures Hallucination Cues in Large Language Models: An Empirical Evaluation

📅 2025-06-20
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Large language models (LLMs) suffer from hallucination, and while chain-of-thought (CoT) prompting reduces hallucination incidence by 12–28%, it concurrently degrades the performance of mainstream hallucination detection methods, reducing F1 scores by 9–34%. Method: This work empirically investigates how CoT, across zero-shot, few-shot, and self-consistency variants, affects hallucination characteristics in both instruction-tuned and reasoning-oriented LLMs, analyzing shifts in hallucination score distributions, detection accuracy, and output confidence. Contribution/Results: We reveal that CoT compromises detectability by smoothing the anomalous confidence peaks detectors rely on, through distortion of internal model states and output probability distributions, improving reasoning quality at the expense of diagnostic signal fidelity. This establishes an inherent trade-off between reasoning enhancement and hallucination detectability. Our findings provide insights for trustworthy LLM evaluation and robust detection method design. Code is publicly available.

📝 Abstract
Large Language Models (LLMs) often exhibit hallucinations, generating factually incorrect or semantically irrelevant content in response to prompts. Chain-of-Thought (CoT) prompting can mitigate hallucinations by encouraging step-by-step reasoning, but its impact on hallucination detection remains underexplored. To bridge this gap, we conduct a systematic empirical evaluation. We begin with a pilot experiment, revealing that CoT reasoning significantly affects the LLM's internal states and token probability distributions. Building on this, we evaluate the impact of various CoT prompting methods on mainstream hallucination detection methods across both instruction-tuned and reasoning-oriented LLMs. Specifically, we examine three key dimensions: changes in hallucination score distributions, variations in detection accuracy, and shifts in detection confidence. Our findings show that while CoT prompting helps reduce hallucination frequency, it also tends to obscure critical signals used for detection, impairing the effectiveness of various detection methods. Our study highlights an overlooked trade-off in the use of reasoning. Code is publicly available at: https://anonymous.4open.science/r/cot-hallu-detect.
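As a rough illustration of the kind of token-probability signal the abstract refers to, the sketch below scores a model's own answer with its mean negative log-likelihood and peak entropy. This is not the paper's detector; the model name ("gpt2"), the scoring choices, and the example prompt are placeholders chosen for a minimal runnable example.

```python
# Minimal sketch of a probability-based hallucination signal: mean token
# negative log-likelihood and peak entropy of an LLM's own answer.
# Model name and scoring choices are placeholders, not the paper's setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM works
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

def answer_scores(prompt: str, answer: str):
    """Score the answer tokens under the model, conditioned on the prompt."""
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    full_ids = tok(prompt + answer, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits              # [1, seq, vocab]
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = full_ids[:, 1:]
    token_ll = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    # Keep only the answer span (positions after the prompt); assumes the
    # prompt re-tokenizes identically as a prefix of the full sequence.
    start = prompt_ids.shape[1] - 1
    answer_ll = token_ll[:, start:]
    probs = log_probs.exp()
    entropy = -(probs * log_probs).sum(-1)[:, start:]
    return {
        "mean_nll": -answer_ll.mean().item(),   # higher -> less confident
        "max_entropy": entropy.max().item(),    # spiky uncertainty peak
    }

print(answer_scores("Q: Who wrote Hamlet?\nA:", " William Shakespeare."))
```

Detectors in this family flag answers whose scores exceed a threshold; the paper's observation is that CoT-style answers tend to smooth exactly these confidence peaks, weakening the detection signal even when hallucinations become less frequent.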
Problem

Research questions and friction points this paper is trying to address.

How CoT prompting affects hallucination detection in LLMs remains underexplored
CoT obscures the signals that detection methods rely on
Trade-off between CoT reasoning gains and hallucination detectability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematically evaluates the impact of CoT prompting variants on mainstream hallucination detection methods
Analyzes shifts in internal states and token probability distributions induced by CoT
Identifies a trade-off between reasoning quality and the signals detectors rely on
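To make the paper's three evaluation dimensions concrete, the sketch below compares a detector's score distribution shift and F1 with and without CoT. The scores, labels, threshold, and distribution parameters are synthetic placeholders invented for illustration; they are not the paper's data or results.

```python
# Illustrative comparison of hallucination-score distributions with and
# without CoT: distribution shift (KS statistic) and detection accuracy (F1).
# All data below is synthetic placeholder data, not the paper's.
import numpy as np
from scipy.stats import ks_2samp
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=500)                  # 1 = hallucinated
scores_direct = rng.normal(loc=labels * 1.0, scale=0.5)  # clearer separation
scores_cot = rng.normal(loc=labels * 0.4, scale=0.5)     # smoothed signal

def detect_f1(scores, labels, threshold):
    """Threshold the score to get binary predictions and compute F1."""
    return f1_score(labels, (scores > threshold).astype(int))

print("KS shift between regimes:", ks_2samp(scores_direct, scores_cot).statistic)
print("F1 without CoT:", detect_f1(scores_direct, labels, 0.5))
print("F1 with CoT:   ", detect_f1(scores_cot, labels, 0.5))
```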
Jiahao Cheng
East China Normal University, Shanghai, China
Tiancheng Su
East China Normal University, Shanghai, China
Jia Yuan
University of Macau
Guoxiu He
East China Normal University, Shanghai, China
Jiawei Liu
Wuhan University, Wuhan, China
Xinqi Tao
Xiaohongshu Inc., Shanghai, China
Jingwen Xie
Xiaohongshu Inc., Shanghai, China
Huaxia Li
Xiaohongshu Inc., Shanghai, China