Beyond In-Domain Detection: SpikeScore for Cross-Domain Hallucination Detection

📅 2026-01-27
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the limited generalization of existing hallucination detection methods in cross-domain settings. It presents the first systematic study of generalizable hallucination detection and reveals that hallucinated dialogues consistently exhibit pronounced spikes in uncertainty during multi-turn interactions. Building on this insight, the authors propose SpikeScore, an unsupervised metric that quantifies uncertainty fluctuations to detect hallucinations across domains. Theoretical analysis establishes the separability of this phenomenon, and a multi-turn dialogue simulation framework is developed for cross-domain evaluation. Extensive experiments demonstrate that SpikeScore significantly outperforms current methods across multiple large language models and benchmarks, achieving superior generalization and detection performance without requiring labeled data.

📝 Abstract
Hallucination detection is critical for deploying large language models (LLMs) in real-world applications. Existing hallucination detection methods achieve strong performance when the training and test data come from the same domain, but they generalize poorly across domains. In this paper, we study an important yet overlooked problem, termed generalizable hallucination detection (GHD), which aims to train hallucination detectors on data from a single domain while ensuring robust performance across diverse related domains. In studying GHD, we simulate multi-turn dialogues following LLMs' initial responses and observe an interesting phenomenon: hallucination-initiated multi-turn dialogues universally exhibit larger uncertainty fluctuations than factual ones across different domains. Based on this phenomenon, we propose a new score, SpikeScore, which quantifies abrupt fluctuations in multi-turn dialogues. Through both theoretical analysis and empirical validation, we demonstrate that SpikeScore achieves strong cross-domain separability between hallucinated and non-hallucinated responses. Experiments across multiple LLMs and benchmarks demonstrate that the SpikeScore-based detection method outperforms representative baselines in cross-domain generalization and surpasses advanced generalization-oriented methods, verifying the effectiveness of our method in cross-domain hallucination detection.
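
The abstract does not give the formula for SpikeScore, so the following is only an illustrative sketch of what "quantifying abrupt fluctuations" over a multi-turn dialogue could look like. It assumes each simulated turn has already been reduced to a single uncertainty value (for example, a mean token entropy), and it uses the largest absolute second-order difference of that sequence as a stand-in fluctuation measure. The function name `spike_score`, the choice of second-order differences, and the sample values are all assumptions for illustration, not the authors' definition or data.

```python
from typing import Sequence


def spike_score(uncertainties: Sequence[float]) -> float:
    """Hypothetical fluctuation score (NOT the paper's formula).

    Returns the largest absolute second-order difference of a
    per-turn uncertainty trajectory. A sharp, isolated jump in
    uncertainty produces a large value; a smooth trajectory
    produces a small one.
    """
    # A second-order difference needs at least three turns.
    if len(uncertainties) < 3:
        return 0.0
    return max(
        abs(uncertainties[i + 1] - 2 * uncertainties[i] + uncertainties[i - 1])
        for i in range(1, len(uncertainties) - 1)
    )


# Made-up per-turn uncertainty values, for illustration only.
steady_dialogue = [0.30, 0.32, 0.31, 0.33, 0.32]
spiky_dialogue = [0.30, 0.35, 0.90, 0.40, 0.38]

if __name__ == "__main__":
    print("steady:", spike_score(steady_dialogue))
    print("spiky: ", spike_score(spiky_dialogue))
```

Under this assumed measure, the trajectory with an abrupt jump scores higher than the smooth one, which mirrors the qualitative observation in the abstract. A detector built this way would compare the score against a threshold tuned on the single training domain.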
Problem

Research questions and friction points this paper is trying to address.

hallucination detection
cross-domain generalization
large language models
generalizable hallucination detection
out-of-domain
Innovation

Methods, ideas, or system contributions that make the work stand out.

SpikeScore
cross-domain hallucination detection
generalizable hallucination detection
uncertainty fluctuation
multi-turn dialogue