🤖 AI Summary
To address hallucination (the generation of factually unsupported content) in large language model (LLM)-based chatbots, this paper proposes a training-free, plug-and-play post-hoc method. Given a user query, the method generates an initial response, retrieves supporting documents via BM25 or embedding-based search, and uses a natural language inference (NLI) model to check whether each claim is grounded in the retrieved evidence. It then iteratively regenerates the response until every claim is attributable to a cited source. The work introduces the paradigm of “training-agnostic posterior citation augmentation,” which yields verifiable and traceable responses without modifying or fine-tuning the underlying LLM, so the method works with any black-box LLM out of the box. Evaluated on three established hallucination benchmarks, it substantially outperforms state-of-the-art approaches, with an absolute improvement of over 8% in both F1 (hallucination detection) and BLEU (response regeneration).
📝 Abstract
Large language models (LLMs) exhibit powerful general intelligence across diverse scenarios, including their integration into chatbots. However, a key challenge for LLM-based chatbots is that they may produce hallucinated content in their responses, which significantly limits their applicability. Various efforts have been made to alleviate hallucination, such as retrieval-augmented generation and reinforcement learning from human feedback, but most of them require additional training and data annotation. In this paper, we propose a novel post-hoc Citation-Enhanced Generation (CEG) approach combined with retrieval augmentation. Unlike previous studies that focus on preventing hallucinations during generation, our method addresses the issue in a post-hoc way. It incorporates a retrieval module that searches for supporting documents relevant to the generated content, and employs a natural language inference-based citation generation module. If any statement in the generated content lacks a supporting reference, our model regenerates the response until all statements are supported by citations. Note that our method is a training-free, plug-and-play plugin that is compatible with various LLMs. Experiments on three hallucination-related benchmarks show that our framework outperforms state-of-the-art methods in both hallucination detection and response regeneration. Our code and dataset will be publicly available.
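The post-hoc loop described in the abstract (generate, retrieve, check entailment, regenerate) can be illustrated with a minimal sketch. This is not the authors' implementation: every name here (`split_statements`, `retrieve`, `toy_entails`, `ceg`, `fake_llm`, `CORPUS`) is hypothetical, the word-overlap retriever stands in for a real retrieval module, and the subset-of-words check stands in for a real NLI model. How unsupported statements are fed back into regeneration is also an assumption.

```python
import re
from typing import Callable, List, Optional, Tuple


def _words(text: str) -> set:
    """Lowercased word set, used by the toy retriever and toy NLI check."""
    return set(re.findall(r"\w+", text.lower()))


def split_statements(text: str) -> List[str]:
    """Naive splitter (assumption: one statement per sentence)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def retrieve(statement: str, corpus: List[str], k: int = 2) -> List[str]:
    """Toy retriever: rank documents by word overlap with the statement."""
    query = _words(statement)
    ranked = sorted(corpus, key=lambda d: len(query & _words(d)), reverse=True)
    return ranked[:k]


def toy_entails(premise: str, hypothesis: str) -> bool:
    """Stand-in for an NLI model: every hypothesis word must occur in the premise."""
    return _words(hypothesis) <= _words(premise)


def ceg(
    query: str,
    generate: Callable[[str, List[str]], str],
    corpus: List[str],
    entails: Callable[[str, str], bool] = toy_entails,
    max_rounds: int = 3,
) -> Optional[List[Tuple[str, str]]]:
    """Return (statement, citation) pairs once all statements are supported,
    or None if no fully supported response is found within max_rounds."""
    unsupported: List[str] = []
    for _ in range(max_rounds):
        # The black-box LLM sees the query plus the statements that failed last round.
        response = generate(query, unsupported)
        cited, unsupported = [], []
        for statement in split_statements(response):
            docs = retrieve(statement, corpus)
            support = next((d for d in docs if entails(d, statement)), None)
            if support is None:
                unsupported.append(statement)
            else:
                cited.append((statement, support))
        if not unsupported:
            return cited
    return None


# Hypothetical corpus and fake "LLM" that corrects itself when given feedback.
CORPUS = ["Paris is the capital of France.", "The Seine flows through Paris."]


def fake_llm(query: str, unsupported: List[str]) -> str:
    if not unsupported:
        return "Paris is the capital of Germany."
    return "Paris is the capital of France."
```

Calling `ceg("What is the capital of France?", fake_llm, CORPUS)` rejects the first response because no retrieved document entails it, then accepts the regenerated one. The underlying model is only ever called through `generate`, which is what makes the approach training-free in this sketch; a real system would swap in an actual retriever and NLI model behind the same interfaces.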