🤖 AI Summary
This work proposes the first streaming hallucination detection framework for long chain-of-thought (CoT) reasoning, addressing the challenge that hallucinations in such settings are often subtle, propagate across reasoning steps, and are hard to detect and localize in real time. The approach models hallucination as a dynamically evolving latent state over the reasoning process, treating step-level judgments as local observations and aggregating them into prefix-accumulated signals that trace the global evolution of hallucinatory behavior. This enables real-time, interpretable monitoring of hallucinations in extended CoT sequences and supplies fine-grained evidence for detected anomalies, improving both the timeliness and the transparency of hallucination detection in complex reasoning systems.
📝 Abstract
Long chain-of-thought (CoT) reasoning improves the performance of large language models, yet hallucinations in such settings often emerge subtly and propagate across reasoning steps. We suggest that hallucination in long CoT reasoning is better understood as an evolving latent state rather than a one-off erroneous event. Accordingly, we treat step-level hallucination judgments as local observations and introduce a cumulative prefix-level hallucination signal that tracks the global evolution of the reasoning state over the entire trajectory. Overall, our approach enables streaming hallucination detection in long CoT reasoning, providing real-time, interpretable evidence.
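The prefix-accumulated signal described above can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's actual method: it assumes step-level hallucination judgments arrive as scores in [0, 1], and it accumulates them into a prefix-level signal with an exponential moving average (the EMA, the smoothing factor `alpha`, and the flagging threshold are all hypothetical choices for illustration).

```python
def prefix_signal(step_scores, alpha=0.3):
    """Fold step-level hallucination judgments (local observations)
    into a streaming prefix-level signal, one value per reasoning step.

    The EMA update is an illustrative stand-in for the paper's
    prefix-accumulation; any streaming aggregator could be used here.
    """
    signal = []
    state = 0.0
    for h in step_scores:
        # blend the new local observation into the evolving latent state
        state = alpha * h + (1 - alpha) * state
        signal.append(state)
    return signal


def flag_steps(signal, threshold=0.5):
    """Return step indices where the accumulated signal crosses the
    threshold, i.e. where hallucination is judged to have taken hold."""
    return [t for t, s in enumerate(signal) if s >= threshold]


# Toy trajectory: early steps look clean, later steps look hallucinated.
scores = [0.1, 0.2, 0.8, 0.9, 0.7]
print(flag_steps(prefix_signal(scores)))
```

Because the signal is computed prefix-by-prefix, each flagged index comes with the step-level scores that produced it, which is the kind of real-time, interpretable evidence the abstract describes.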