Ground Every Sentence: Improving Retrieval-Augmented LLMs with Interleaved Reference-Claim Generation

📅 2024-07-01
🏛️ North American Chapter of the Association for Computational Linguistics
📈 Citations: 7
Influential: 0
🤖 AI Summary
Existing RAG systems predominantly employ coarse-grained, paragraph-level attribution, which compromises verifiability in long-document question answering. This work introduces ReClaim, the first framework to enable sentence-level fine-grained attribution, achieving per-sentence traceability by alternating the generation of claims and their corresponding citations. Methodologically, ReClaim combines instruction tuning, autoregressive interleaved decoding, and citation-aware decoding control within a RAG architecture to ensure precise, context-aware attribution. On long-document QA benchmarks it reaches 90% citation accuracy, substantially outperforming state-of-the-art coarse-grained approaches, and demonstrates robust, reliable verifiable generation. The core contribution is an end-to-end verifiable generation paradigm that enforces strict claim–citation alignment, advancing the trustworthiness and interpretability of RAG-based systems.

📝 Abstract
Retrieval-Augmented Generation (RAG) has been widely adopted to enhance Large Language Models (LLMs) in knowledge-intensive tasks. To improve credibility and verifiability in RAG systems, Attributed Text Generation (ATG) has been proposed, which adds citations to retrieved knowledge in LLM-generated responses. Prior methods mainly adopt coarse-grained attribution, with passage-level or paragraph-level references or citations, which falls short in verifiability. This paper proposes ReClaim (Refer&Claim), a fine-grained ATG method that alternates the generation of references and answers step by step. Unlike previous coarse-grained attribution, ReClaim provides sentence-level citations in long-form question-answering tasks. With extensive experiments, we verify the effectiveness of ReClaim across a wide range of settings, achieving a citation accuracy rate of 90%.
Problem

Research questions and friction points this paper is trying to address.

Enhancing credibility in Retrieval-Augmented LLMs with fine-grained citations
Improving verifiability by providing sentence-level references in responses
Addressing limitations of coarse-grained attributions in knowledge-intensive tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-grained ATG method for citations
Interleaved reference-claim generation
Sentence-level citations in QA tasks
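The interleaved reference-claim loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `select_reference` and `generate_claim` are hypothetical placeholders standing in for the citation-aware decoding step and the LLM claim-generation step, and the sentence-level passages are assumed to be pre-segmented.

```python
def select_reference(question, sentences, used):
    """Pick the next uncited evidence sentence (stand-in for citation-aware decoding)."""
    for idx, sent in enumerate(sentences):
        if idx not in used:
            return idx, sent
    return None, None

def generate_claim(question, reference):
    """Produce a claim grounded in the selected reference (stand-in for an LLM call)."""
    return f"Claim grounded in: {reference}"

def interleaved_answer(question, sentences, max_steps=10):
    """Alternate reference selection and claim generation, so every
    answer sentence carries its own sentence-level citation."""
    used, answer = set(), []
    for _ in range(max_steps):
        idx, ref = select_reference(question, sentences, used)
        if idx is None:  # no uncited evidence left
            break
        used.add(idx)
        answer.append({"citation": idx, "reference": ref,
                       "claim": generate_claim(question, ref)})
    return answer
```

The key property the sketch preserves is that each generated claim is paired with exactly one citation at the moment it is produced, rather than citations being attached to a whole paragraph after the fact.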
Sirui Xia
School of Computer Science, Fudan University
Xintao Wang
School of Computer Science, Fudan University
Jiaqing Liang
Fudan University
knowledge graph, deep learning
Yifei Zhang
School of Computer Science, Fudan University
Weikang Zhou
AntGroup
Jiaji Deng
AntGroup
Fei Yu
AntGroup
Yanghua Xiao
School of Computer Science, Fudan University