🤖 AI Summary
To address factual hallucinations in large language model (LLM)–generated text, this paper proposes a three-stage post-editing framework: (1) retrieving counterevidence via external tools (e.g., search engines or APIs); (2) prompting the LLM to generate a "factual error explanation"—a novel reasoning step that identifies and articulates the root cause of the error; and (3) performing a precise correction grounded in this explanation. The method integrates lightweight chain-of-explanation prompting, LLM self-reflective rewriting, and prompt compression to jointly ensure high correction fidelity and substantially reduce computational overhead. Experiments across multiple benchmarks demonstrate that the proposed approach outperforms FacTool, CoVE, and RARR in both error detection and correction accuracy, while reducing inference latency by up to 42% and token consumption by up to 38%.
📝 Abstract
Mitigating hallucination is a key challenge that must be overcome to reliably deploy large language models (LLMs) in real-world scenarios. Recently, various methods have been proposed to detect and revise factual errors in LLM-generated text in order to reduce hallucination. In this paper, we propose Re-Ex, a method for post-editing LLM-generated responses. Re-Ex introduces a novel reasoning step dubbed the factual error explanation step. Re-Ex revises the initial response of an LLM in three steps: first, external tools are used to retrieve evidence of the factual errors in the initial LLM response; next, the LLM is instructed to explain the problematic parts of the response based on the gathered evidence; finally, the LLM revises the initial response using the explanations provided in the previous step. In addition to the explanation step, Re-Ex also incorporates new prompting techniques to reduce the token count and inference time required for the response revision process. Compared with existing methods, including FacTool, CoVE, and RARR, Re-Ex provides better detection and revision performance with less inference time and fewer tokens on multiple benchmarks.
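The three revision steps described above can be sketched as a simple pipeline. This is a minimal illustration, not the paper's implementation: the helper names (`re_ex_revise`, `retrieve_evidence`, `llm`) and the prompt wording are hypothetical, and the retrieval tool and LLM are injected as callables so a real search engine or LLM API could be swapped in.

```python
def re_ex_revise(response, retrieve_evidence, llm):
    """Revise an LLM response via the three Re-Ex-style steps (sketch)."""
    # Step 1: use an external tool to gather evidence about possible factual errors.
    evidence = retrieve_evidence(response)
    # Step 2: instruct the LLM to explain the problematic parts, grounded in the evidence.
    explanation = llm(
        f"Response: {response}\nEvidence: {evidence}\n"
        "Explain any factual errors in the response."
    )
    # Step 3: instruct the LLM to revise the response using that explanation.
    revised = llm(
        f"Response: {response}\nError explanation: {explanation}\n"
        "Revise the response to fix the explained errors."
    )
    return revised

# Toy stubs showing the control flow; a real system would call a search
# engine in step 1 and an LLM API in steps 2 and 3.
def fake_retrieve(resp):
    return ["Mount Everest is 8,849 m tall."]

def fake_llm(prompt):
    if "Explain any factual errors" in prompt:
        return "The stated height 9,000 m contradicts the evidence (8,849 m)."
    return "Mount Everest is 8,849 m tall."

print(re_ex_revise("Mount Everest is 9,000 m tall.", fake_retrieve, fake_llm))
```

The key design point is that the explanation from step 2 is passed into the revision prompt of step 3, so the final rewrite is conditioned on an articulated diagnosis of the error rather than on the raw evidence alone.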