Improving Factuality with Explicit Working Memory

📅 2024-12-24
🏛️ arXiv.org
🤖 AI Summary
Large language models (LLMs) frequently produce factual hallucinations in long-form generation, undermining the reliability of their outputs. This paper proposes Explicit Working Memory (EWE), a dynamic memory architecture that is updated online during text generation. EWE integrates retrieval feedback and fact-checking results in real time, allowing erroneous statements to be corrected as they arise via rule-based memory updates. Unlike conventional retrieval-augmented generation (RAG) approaches, which rely on static prompting and iterative post-hoc refinement, EWE establishes a closed-loop interaction between generation and verification. Evaluated on four fact-seeking long-form benchmarks, the method improves VeriScore by 2 to 6 absolute points while preserving the helpfulness of responses.

📝 Abstract
Large language models can generate factually inaccurate content, a problem known as hallucination. Recent works have built upon retrieval-augmented generation to improve factuality through iterative prompting, but these methods are limited by the traditional RAG design. To address these challenges, we introduce EWE (Explicit Working Memory), a novel approach that enhances factuality in long-form text generation by integrating a working memory that receives real-time feedback from external resources. The memory is refreshed based on online fact-checking and retrieval feedback, allowing EWE to rectify false claims during the generation process and ensure more accurate and reliable outputs. Our experiments demonstrate that EWE outperforms strong baselines on four fact-seeking long-form generation datasets, increasing the factuality metric, VeriScore, by 2 to 6 points absolute without sacrificing the helpfulness of the responses. Further analysis reveals that the design of rules for memory updates, the configuration of memory units, and the quality of the retrieval datastore are crucial factors influencing model performance.
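The abstract describes a loop in which generation is interleaved with retrieval and online fact-checking, and a working memory of evidence is refreshed so that unsupported claims can be rectified before they reach the output. A minimal sketch of that idea is shown below; all function names, the string-matching retriever, and the substring-based verifier are illustrative assumptions, not the paper's actual implementation.

```python
# Toy sketch of a generate-verify-refresh loop in the spirit of EWE.
# The retriever, verifier, and memory-update rule here are stand-ins,
# not the method described in the paper.

def retrieve(query, datastore):
    """Illustrative retriever: return passages that mention the query term."""
    return [p for p in datastore if query.lower() in p.lower()]

def fact_check(sentence, evidence):
    """Illustrative verifier: a claim counts as supported if some
    evidence passage contains it verbatim (case-insensitive)."""
    return any(sentence.lower() in p.lower() for p in evidence)

def generate_with_working_memory(claims, datastore):
    """Emit claims one at a time. Before each claim, refresh the working
    memory with retrieved evidence; claims the verifier cannot support
    are rectified (here: replaced with a placeholder) instead of emitted."""
    memory = []   # working memory: evidence passages, refreshed online
    output = []
    for claim in claims:
        # Rule-based memory update: refresh with fresh retrieval results,
        # falling back to the previous memory if nothing is retrieved.
        memory = retrieve(claim.split()[0], datastore) or memory
        if fact_check(claim, memory):
            output.append(claim)                      # supported -> keep
        else:
            output.append("[unsupported claim revised]")  # rectify
    return output
```

For example, with a two-passage datastore about Paris, a supported claim passes through unchanged while a contradicted one is caught by the verifier and replaced, mirroring (in a very reduced form) how the paper rectifies false claims mid-generation.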
Problem

Research questions and friction points this paper is trying to address.

Enhancing factuality in long-form text generation
Reducing hallucination in language models
Integrating real-time feedback for accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates real-time retrieval and fact-checking feedback
Enhances factuality with an explicit working memory
Extends retrieval-augmented generation beyond static prompting
👥 Authors
Mingda Chen — Meta FAIR
Yang Li — Meta FAIR
Karthik Padthe — Meta FAIR
Rulin Shao — University of Washington
Alicia Sun — Meta FAIR
Luke S. Zettlemoyer — Meta FAIR
Gargi Gosh — Meta FAIR
Wen-tau Yih — Meta FAIR