Correcting Hallucinations in News Summaries: Exploration of Self-Correcting LLM Methods with External Knowledge

📅 2025-06-24
🤖 AI Summary
To address factual hallucinations in LLM-generated news summaries, this paper applies an external-knowledge-augmented, multi-turn self-correction framework: it generates verification questions targeting the summary's factual claims, retrieves evidentiary snippets from three complementary search engines to answer them, and iteratively refines the summary with the verified answers. The work is among the first systematic applications of the self-correction paradigm to news summarization, and it finds that the quality of retrieved search snippets and the design of few-shot prompts are decisive factors in hallucination mitigation. Experiments show a significant reduction in hallucination rates and a marked improvement in factual accuracy, and G-Eval automated evaluation agrees strongly with human assessment (Spearman's ρ > 0.92), supporting the method's effectiveness and practicality in real-world news summarization scenarios.
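The generate–retrieve–answer–refine loop described above can be sketched as follows. Everything in this sketch is a hypothetical stand-in: the LLM and search-engine calls are replaced by simple stubs (invented function names, not the paper's implementation) so the control flow can run end to end.

```python
# Illustrative sketch of a multi-turn self-correction loop with external
# evidence. All functions are hypothetical stubs, not the paper's code.

def generate_verification_questions(summary):
    # Stub: a real system would prompt an LLM to produce questions
    # targeting each factual claim in the summary.
    return [f"Is it true that: {c.strip()}?" for c in summary.split(".") if c.strip()]

def retrieve_evidence(question, engines=("engine_a", "engine_b", "engine_c")):
    # Stub: a real system would query several search engines and
    # collect the returned snippets.
    return [f"[{e}] snippet for: {question}" for e in engines]

def answer_with_evidence(question, snippets):
    # Stub: a real system would prompt an LLM to answer the question
    # grounded in the retrieved snippets.
    return f"Answer to '{question}' based on {len(snippets)} snippets"

def refine_summary(summary, qa_pairs):
    # Stub: a real system would prompt an LLM to rewrite the summary
    # so that it is consistent with the verified answers.
    return summary + f" [refined using {len(qa_pairs)} verified facts]"

def self_correct(summary, max_turns=2):
    # Multi-turn loop: question generation -> evidence retrieval ->
    # grounded answering -> refinement, repeated max_turns times.
    for _ in range(max_turns):
        questions = generate_verification_questions(summary)
        qa_pairs = []
        for q in questions:
            snippets = retrieve_evidence(q)
            qa_pairs.append((q, answer_with_evidence(q, snippets)))
        summary = refine_summary(summary, qa_pairs)
    return summary

corrected = self_correct("The merger was announced in March. Shares rose 5%.")
print(corrected)
```

In a real deployment each stub would be an LLM or search API call; the loop structure (questions, per-question evidence, a refinement step per turn) is the part the paper's framework prescribes.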

📝 Abstract
While large language models (LLMs) have shown remarkable capabilities to generate coherent text, they suffer from the issue of hallucinations -- factually inaccurate statements. Among numerous approaches to tackle hallucinations, especially promising are the self-correcting methods. They leverage the multi-turn nature of LLMs to iteratively generate verification questions requesting additional evidence, answer them with internal or external knowledge, and use that to refine the original response with the new corrections. These methods have been explored for encyclopedic generation, but less so for domains like news summarization. In this work, we investigate two state-of-the-art self-correcting systems by applying them to correct hallucinated summaries using evidence from three search engines. We analyze the results and provide insights into the systems' performance, revealing practical findings on the benefits of search engine snippets and few-shot prompts, as well as high alignment between G-Eval and human evaluation.
Problem

Research questions and friction points this paper is trying to address.

Correcting factual inaccuracies in news summaries
Exploring self-correcting LLM methods with external knowledge
Evaluating performance using search engines and human assessment
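The agreement between automatic and human evaluation is reported via Spearman's rank correlation, which for tie-free rankings reduces to ρ = 1 − 6Σd²/(n(n² − 1)). A minimal pure-Python sketch, using invented scores for illustration (not the paper's data):

```python
# Spearman's rank correlation for tie-free score lists, as used to compare
# G-Eval scores against human judgments. Scores below are hypothetical.

def ranks(values):
    # Rank 1 = smallest value; this simple version assumes no ties.
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r

def spearman_rho(x, y):
    # rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), d = rank difference per item.
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

human = [1, 2, 3, 4, 5]  # hypothetical human factuality scores
geval = [1, 3, 2, 4, 5]  # hypothetical G-Eval scores
print(spearman_rho(human, geval))  # → 0.9
```

A ρ above 0.92, as the paper reports, indicates that swapping one ranking for the other would barely change which summaries are judged most and least factual.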
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-correcting LLM methods reduce hallucinations
External knowledge from search engines verifies facts
Few-shot prompts improve summary accuracy