Copy-Paste to Mitigate Large Language Model Hallucinations

📅 2025-10-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) in retrieval-augmented generation (RAG) often distrust retrieved context, leading to hallucinations and reduced faithfulness. To address this, CopyPasteLLM applies two-stage high-copying-response preference training to explicitly steer models toward reusing provided context. Key contributions: (1) an empirical finding that a response's copying degree correlates inversely with context-unfaithful hallucination; (2) the Context-Parameter Copying Capturing (CPCC) algorithm, which reveals how the model recalibrates its reliance on internal parametric knowledge versus external context; and (3) a fully automated data-generation pipeline built on three copy-enhancing prompting strategies. Evaluated on FaithEval, ConFiQA, and PubMedQA, CopyPasteLLM achieves 12.2% to 24.5% absolute accuracy gains on FaithEval using only 365 training samples (1/50th of the baseline's data), significantly outperforming state-of-the-art methods.

📝 Abstract
While Retrieval-Augmented Generation (RAG) enables large language models (LLMs) to generate contextually grounded responses, contextual faithfulness remains challenging, as LLMs may not consistently trust provided context, leading to hallucinations that undermine reliability. We observe an inverse correlation between response copying degree and context-unfaithful hallucinations on RAGTruth, suggesting that higher copying degrees reduce hallucinations by fostering genuine contextual belief. We propose CopyPasteLLM, obtained through two-stage high-copying response preference training. We design three prompting methods to enhance copying degree, demonstrating that high-copying responses achieve superior contextual faithfulness and hallucination control. These approaches enable a fully automated pipeline that transforms generated responses into high-copying preference data for training CopyPasteLLM. On FaithEval, ConFiQA, and PubMedQA, CopyPasteLLM achieves the best performance in both counterfactual and original contexts, with remarkable 12.2% to 24.5% accuracy improvements on FaithEval over the best baseline, while requiring only 365 training samples -- 1/50th of the baseline data. To elucidate CopyPasteLLM's effectiveness, we propose the Context-Parameter Copying Capturing algorithm. Interestingly, this reveals that CopyPasteLLM recalibrates reliance on internal parametric knowledge rather than external knowledge during generation. All code is available at https://github.com/longyongchao/CopyPasteLLM
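The page does not spell out how "copying degree" is computed; as a rough sketch of the idea, the share of a response's word n-grams that appear verbatim in the retrieved context is one natural measure (the function names and the n=4 window are assumptions, not the paper's definition):

```python
from collections import Counter

def ngrams(tokens, n):
    """Multiset of contiguous n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def copy_degree(response, context, n=4):
    """Fraction of the response's word n-grams found verbatim in the
    context: 1.0 = fully copied, 0.0 = no verbatim overlap."""
    resp = ngrams(response.lower().split(), n)
    ctx = ngrams(context.lower().split(), n)
    total = sum(resp.values())
    copied = sum(min(count, ctx[g]) for g, count in resp.items())
    return copied / total if total else 0.0

context = "the capital of australia is canberra not sydney as many assume"
print(copy_degree("the capital of australia is canberra", context))  # 1.0
print(copy_degree("many people think sydney but it is actually canberra", context))  # 0.0
```

Under the paper's finding, the first (fully copied) response would be the one less likely to hallucinate, which is the behavior the preference training rewards.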
Problem

Research questions and friction points this paper is trying to address.

Mitigating hallucinations in LLMs by encouraging high-copying responses
Enhancing contextual faithfulness in retrieval-augmented generation
Reducing over-reliance on internal parametric knowledge via copy-paste training
Innovation

Methods, ideas, or system contributions that make the work stand out.

CopyPasteLLM: two-stage high-copying response preference training
Three prompting methods that raise copying degree and reduce hallucinations
Recalibrates reliance on internal parametric knowledge rather than external context
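The abstract describes an automated pipeline that turns generated responses into high-copying preference data. A minimal sketch of what such a step might look like, assuming an n-gram copy score and DPO-style (prompt, chosen, rejected) pairs -- the field names and the 0.3 margin are hypothetical, not taken from the paper:

```python
def copy_degree(response, context, n=4):
    """Fraction of the response's word n-grams found verbatim in the context."""
    grams = lambda t: {tuple(t[i:i + n]) for i in range(len(t) - n + 1)}
    resp, ctx = grams(response.lower().split()), grams(context.lower().split())
    return len(resp & ctx) / len(resp) if resp else 0.0

def build_preference_pairs(samples, margin=0.3):
    """Chosen = highest-copying candidate, rejected = lowest; keep the pair
    only when the copy-degree gap is wide enough to be a clear signal."""
    pairs = []
    for s in samples:
        scored = sorted((copy_degree(r, s["context"]), r) for r in s["responses"])
        (lo, rejected), (hi, chosen) = scored[0], scored[-1]
        if hi - lo >= margin:
            pairs.append({"prompt": s["question"],
                          "chosen": chosen,
                          "rejected": rejected})
    return pairs
```

Pairs built this way could feed any standard preference-optimization trainer; the paper's actual two-stage recipe is not detailed on this page.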
Yongchao Long
Department of Computer Science, Tianjin University of Technology, Tianjin, China.
Medical Large Language Models
Xian Wu
Tencent Jarvis Lab, Shenzhen, China
Yingying Zhang
Tencent Jarvis Lab, Shenzhen, China
Xianbin Wen
Department of Computer Science, Tianjin University of Technology, Tianjin, China
Yuxi Zhou
Department of Computer Science, Tianjin University of Technology, Tianjin, China
Shenda Hong
Assistant Professor, Peking University
AI ECG · Biosignal · AI for Digital Health · Health Data Science · AI for Healthcare