Think Before You Attribute: Improving the Performance of LLMs Attribution Systems

📅 2025-05-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
In scientific and other high-stakes applications, large language models (LLMs) often produce answers without reliable attribution: sources are missing, incorrectly assigned, or given only at the coarse document level, which hinders practical deployment. To address this, the paper proposes a sentence-level pre-attribution step for retrieval-augmented generation (RAG) systems: a classifier sorts each generated sentence into three categories (not attributable, attributable to a single quote, attributable to multiple quotes), so that an appropriate attribution strategy can be selected per sentence, or attribution can be skipped entirely. This routing reduces the computational cost of attribution while enabling fine-grained, sentence-level source matching. Key contributions: (1) a sentence-level pre-attribution classification step; (2) a cleaned version of the HAGRID attribution dataset; and (3) an end-to-end, open-source attribution system that works out of the box.

📝 Abstract
Large Language Models (LLMs) are increasingly applied across science domains, yet their broader adoption remains constrained by a critical challenge: the lack of trustworthy, verifiable outputs. Current LLMs often generate answers without reliable source attribution, or worse, with incorrect attributions, posing a barrier to their use in scientific and high-stakes settings, where traceability and accountability are non-negotiable. To be reliable, attribution systems need high accuracy and should retrieve short spans, i.e., attribute to a sentence within a document rather than a whole document. We propose a sentence-level pre-attribution step for Retrieval-Augmented Generation (RAG) systems that classifies sentences into three categories: not attributable, attributable to a single quote, and attributable to multiple quotes. By sorting sentences before attribution, a suitable attribution method can be selected for each type of sentence, or attribution can be skipped altogether. Our results indicate that classifiers are well-suited for this task. In this work, we propose a pre-attribution step to reduce the computational complexity of attribution, provide a clean version of the HAGRID dataset, and provide an end-to-end attribution system that works out of the box.
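The pre-attribution routing described in the abstract can be sketched as a small three-class text classifier. Everything below is an illustrative toy, not the authors' implementation: the bag-of-words nearest-profile model, the training sentences, and the `route_sentence` helper are all assumptions standing in for a real classifier trained on HAGRID-style labeled data.

```python
from collections import Counter
import math

# Hypothetical labeled sentences standing in for HAGRID-style training data.
TRAIN = {
    "not_attributable": [
        "In summary, the outlook is promising.",
        "Thank you for the question.",
    ],
    "single_quote": [
        "The melting point of gallium is 29.76 degrees celsius.",
        "The treaty was signed in 1648.",
    ],
    "multiple_quotes": [
        "Several independent studies report conflicting mortality estimates.",
        "Multiple sources describe both economic and political causes.",
    ],
}

def bag(text: str) -> Counter:
    """Lower-cased bag-of-words features."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# One profile per class: the merged word counts of its example sentences.
PROFILES = {label: sum((bag(s) for s in sents), Counter())
            for label, sents in TRAIN.items()}

def route_sentence(sentence: str) -> str:
    """Classify one answer sentence to pick an attribution strategy:
    skip retrieval, match a single quote, or match multiple quotes."""
    feats = bag(sentence)
    return max(PROFILES, key=lambda label: cosine(feats, PROFILES[label]))
```

In a full RAG pipeline, sentences routed to `not_attributable` would skip retrieval entirely, while the other two labels would select a single-quote or multi-quote matching strategy, which is where the claimed reduction in retrieval overhead comes from.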
Problem

Research questions and friction points this paper is trying to address.

LLMs lack trustworthy, verifiable outputs with reliable source attribution
Current attribution systems struggle with accuracy and precise sentence-level retrieval
Scientific settings demand traceability and accountability in LLM-generated answers
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sentence-level pre-attribution for RAG systems
Classifier-based sentence categorization for attribution
End-to-end system with cleaned HAGRID dataset