🤖 AI Summary
To address the challenge of detecting hallucinations in text generated by black-box large language models (LLMs), this paper proposes a novel hallucination detection method based on future-context sampling. The core innovation lies in leveraging *future-generated context* (i.e., tokens sampled from subsequent positions) as an explicit detection signal, thereby capturing the persistence of hallucinatory patterns across textual continuations. The approach constructs a stride-wise consistency analysis framework by sampling and fusing future token sequences, enabling robust comparison with the original output; it is fully compatible with standard decoding strategies (e.g., top-k and nucleus sampling). Extensive experiments across multiple benchmark datasets and sampling configurations demonstrate that the method significantly improves hallucination detection accuracy, achieving an average 12.3% gain in F1 score. Crucially, it operates without requiring access to model internals or parameter fine-tuning, ensuring strong practicality and generalizability across diverse LLMs and generation settings.
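A minimal sketch of the future-context sampling idea, assuming a generic black-box completion API: the `generate` callable, the sample count, the token-overlap score, and the decision threshold below are illustrative placeholders, not the paper's exact configuration.

```python
from typing import Callable, List

def sample_future_contexts(
    generate: Callable[[str], str],  # black-box sampler (e.g., nucleus sampling)
    prefix: str,                     # original text up to the span under test
    num_samples: int = 5,            # how many future continuations to draw
) -> List[str]:
    """Draw alternative continuations ("future contexts") from the prefix."""
    return [generate(prefix) for _ in range(num_samples)]

def consistency_score(sentence: str, futures: List[str]) -> float:
    """Toy agreement measure: token overlap between the candidate sentence
    and each sampled future context, averaged over samples. A real system
    would substitute a stronger measure (e.g., NLI entailment) here."""
    sent_tokens = set(sentence.lower().split())
    if not sent_tokens:
        return 0.0
    overlaps = [
        len(sent_tokens & set(f.lower().split())) / len(sent_tokens)
        for f in futures
    ]
    return sum(overlaps) / len(overlaps)

def flag_hallucination(
    sentence: str, prefix: str, generate: Callable[[str], str],
    threshold: float = 0.3,  # illustrative cutoff, not a tuned value
) -> bool:
    """Low agreement with sampled futures is treated as a hallucination signal,
    reflecting the observation that hallucinations persist across continuations."""
    futures = sample_future_contexts(generate, prefix)
    return consistency_score(sentence, futures) < threshold
```

Because only sampled text is consumed, the sketch needs no access to logits or parameters, which is what makes the setting black-box.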
📄 Abstract
Large Language Models (LLMs) are widely used to generate plausible text on online platforms without revealing the generation process. As users increasingly encounter such black-box outputs, detecting hallucinations has become a critical challenge. To address it, we develop a hallucination detection framework for black-box generators. Motivated by the observation that hallucinations, once introduced, tend to persist in subsequent text, we sample future contexts. These sampled future contexts provide valuable clues for hallucination detection and can be effectively integrated with various sampling-based detection methods, as sketched below. We extensively demonstrate performance improvements across multiple methods using our proposed sampling approach.
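As a rough illustration of how sampled future contexts could be combined with an existing sampling-based detector, the sketch below fuses two agreement scores; the `score_against` function and the fusion weight `alpha` are hypothetical stand-ins for whatever agreement metric and weighting a concrete method uses.

```python
from typing import Callable, List

def fused_hallucination_score(
    sentence: str,
    resampled_responses: List[str],  # evidence from a standard sampling-based method
    future_contexts: List[str],      # evidence from future-context sampling
    score_against: Callable[[str, List[str]], float],  # any agreement metric
    alpha: float = 0.5,              # illustrative fusion weight
) -> float:
    """Combine agreement with resampled responses and with sampled future
    contexts into one score; lower agreement = stronger hallucination signal."""
    resample_agreement = score_against(sentence, resampled_responses)
    future_agreement = score_against(sentence, future_contexts)
    return alpha * resample_agreement + (1.0 - alpha) * future_agreement
```

This fusion shape is one simple way to add future-context evidence to an existing detector without modifying the detector itself.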