🤖 AI Summary
Existing RAG systems predominantly employ paragraph-level coarse-grained attribution, which compromises verifiability in long-document question answering. This work introduces ReClaim, the first framework to enable sentence-level fine-grained attribution, achieving per-sentence traceability through alternating generation of claims and their corresponding citations. Methodologically, ReClaim integrates instruction tuning, autoregressive interleaved decoding, and citation-aware decoding control within a RAG architecture to ensure precise, context-aware attribution. On long-document QA benchmarks, it achieves 90% citation accuracy—substantially outperforming state-of-the-art coarse-grained approaches—and demonstrates robustness and reliability in verifiable generation. The core contribution is the establishment of the first end-to-end verifiable generation paradigm that enforces strict claim–citation alignment, thereby advancing trustworthiness and interpretability in RAG-based systems.
📝 Abstract
Retrieval-Augmented Generation (RAG) has been widely adopted to enhance Large Language Models (LLMs) in knowledge-intensive tasks. To enhance credibility and verifiability in RAG systems, Attributed Text Generation (ATG) is proposed, which provides citations to retrieval knowledge in LLM-generated responses. Prior methods mainly adopt coarse-grained attributions, with passage-level or paragraph-level references or citations, which fall short in verifiability. This paper proposes ReClaim (Refer&Claim), a fine-grained ATG method that alternates the generation of references and answers step by step. Different from previous coarse-grained attribution, ReClaim provides sentence-level citations in long-form question-answering tasks. With extensive experiments, we verify the effectiveness of ReClaim in extensive settings, achieving a citation accuracy rate of 90%.