Concise and Sufficient Sub-Sentence Citations for Retrieval-Augmented Generation

📅 2025-09-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing RAG systems employ coarse-grained attribution (at the sentence or paragraph level), leading to information redundancy or omission of critical evidence and thereby undermining output verifiability. To address this, we propose a sub-sentence, fine-grained attribution framework. First, we establish annotation guidelines for sub-sentence citations and a corresponding benchmark dataset. Second, we design an LLM-based automated data generation pipeline incorporating a credit (credibility scoring) model to filter out low-quality samples, supplemented by human verification. Third, we fine-tune retrieval-augmented generation models to produce sub-sentence citations that are precise, concise, and informationally sufficient. Experiments demonstrate substantial improvements in attribution accuracy and user verification efficiency while preserving readability and informational completeness. The approach advances RAG interpretability and trustworthy reasoning by enabling granular, evidence-grounded justification.
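
The data-generation-and-filtering step can be pictured roughly as in the sketch below. This is a minimal illustration of the idea under stated assumptions, not the authors' implementation; every name in it (generate_candidate_citation, CreditScorer, SCORE_THRESHOLD) is a hypothetical placeholder.

```python
# Minimal sketch of the described pipeline: an LLM proposes a sub-sentence
# citation span for each answer statement, a separate credit model scores each
# candidate, and only high-scoring examples are kept for fine-tuning.
# All names below are hypothetical placeholders, not the paper's code.

from dataclasses import dataclass


@dataclass
class CitationExample:
    statement: str       # a sentence from the RAG answer
    passage: str         # the retrieved source passage
    citation_span: str   # sub-sentence span selected as evidence


SCORE_THRESHOLD = 0.8    # assumed cutoff for keeping an example


def generate_candidate_citation(statement: str, passage: str) -> str:
    """Placeholder for an LLM call that extracts a concise, sufficient
    sub-sentence evidence span from the passage for the given statement."""
    raise NotImplementedError


class CreditScorer:
    """Placeholder for the credit model that rates candidate examples."""

    def score(self, example: CitationExample) -> float:
        raise NotImplementedError


def build_finetuning_data(pairs, scorer: CreditScorer):
    """Generate candidate citations and keep only high-scoring examples."""
    kept = []
    for statement, passage in pairs:
        span = generate_candidate_citation(statement, passage)
        example = CitationExample(statement, passage, span)
        # Filter out low-quality examples before fine-tuning.
        if scorer.score(example) >= SCORE_THRESHOLD:
            kept.append(example)
    return kept
```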

📝 Abstract
In retrieval-augmented generation (RAG) question answering systems, generating citations for large language model (LLM) outputs enhances verifiability and helps users identify potential hallucinations. However, we observe two problems in the citations produced by existing attribution methods. First, the citations are typically provided at the sentence or even paragraph level. Long sentences or paragraphs may include a substantial amount of irrelevant content. Second, sentence-level citations may omit information that is essential for verifying the output, forcing users to read the surrounding context. In this paper, we propose generating sub-sentence citations that are both concise and sufficient, thereby reducing the effort required by users to confirm the correctness of the generated output. To this end, we first develop annotation guidelines for such citations and construct a corresponding dataset. Then, we propose an attribution framework for generating citations that adhere to our standards. This framework leverages LLMs to automatically generate fine-tuning data for our task and employs a credit model to filter out low-quality examples. Our experiments on the constructed dataset demonstrate that the proposed approach can generate high-quality and more readable citations.
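
To make the "concise and sufficient" distinction concrete, here is a purely illustrative contrast between a sentence-level citation and a sub-sentence one; the text is invented for illustration and is not drawn from the paper's dataset.

```python
# Invented example: a sentence-level citation carries extra, irrelevant detail,
# while a sub-sentence citation keeps only the span needed to verify the claim.

source_sentence = (
    "The bridge, designed in the late 19th century and renovated twice since, "
    "spans 1,280 metres across the strait."
)
answer_statement = "The bridge spans 1,280 metres."

sentence_level_citation = source_sentence                       # includes irrelevant detail
sub_sentence_citation = "spans 1,280 metres across the strait"  # enough to verify the claim

print(answer_statement, "->", sub_sentence_citation)
```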
Problem

Research questions and friction points this paper is trying to address.

Citations in RAG systems are too long and include irrelevant content
Sentence-level citations may omit essential verification information
Users need to read excessive context to confirm output correctness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sub-sentence citations for concise verification
LLM-generated fine-tuning data for attribution
Credit model filters low-quality training examples
Guo Chen
College of Artificial Intelligence, Nanjing University of Aeronautics and Astronautics, Nanjing, China
Qiuyuan Li
College of Artificial Intelligence, Nanjing University of Aeronautics and Astronautics, Nanjing, China
Qiuxian Li
College of Artificial Intelligence, Nanjing University of Aeronautics and Astronautics, Nanjing, China
Hongliang Dai
Nanjing University of Aeronautics and Astronautics
Information Extraction · LLMs · Knowledge Graph
Xiang Chen
MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China
Piji Li
College of Artificial Intelligence, Nanjing University of Aeronautics and Astronautics, Nanjing, China