TROVE: A Challenge for Fine-Grained Text Provenance via Source Sentence Tracing and Relationship Classification

📅 2025-03-19
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This paper addresses the lack of fine-grained provenance tracing for large language model (LLM)-generated text in high-stakes domains such as healthcare and law. It formally defines and systematically models sentence-level textual provenance: localizing the original source sentence and classifying the semantic relationship between target and source, namely quotation, compression, inference, or paraphrase. To support this task, the authors introduce TROVE, a high-quality, cross-lingual, multi-document, long-context benchmark featuring a four-category relation annotation schema and a three-stage hybrid annotation pipeline (GPT-assisted pre-screening, human expert refinement, and retrieval-augmented validation). Evaluating 11 LLMs via both direct prompting and RAG-based methods, the authors demonstrate that retrieval is critical for accurate provenance tracing. While closed-source models generally outperform open-source ones, the latter achieve substantial gains when augmented with RAG.

๐Ÿ“ Abstract
LLMs have achieved remarkable fluency and coherence in text generation, yet their widespread adoption has raised concerns about content reliability and accountability. In high-stakes domains such as healthcare, law, and news, it is crucial to understand where and how content is created. To address this, we introduce the Text pROVEnance (TROVE) challenge, designed to trace each sentence of a target text back to specific source sentences within potentially lengthy or multi-document inputs. Beyond identifying sources, TROVE annotates the fine-grained relationships (quotation, compression, inference, and others), providing a deep understanding of how each target sentence is formed. To benchmark TROVE, we construct our dataset by leveraging three public datasets covering 11 diverse scenarios (e.g., QA and summarization) in English and Chinese, spanning source texts of varying lengths (0-5k, 5-10k, 10k+ tokens) and emphasizing the multi-document and long-document settings essential for provenance. To ensure high-quality data, we employ a three-stage annotation process: sentence retrieval, GPT provenance, and human provenance. We evaluate 11 LLMs under direct prompting and retrieval-augmented paradigms, revealing that retrieval is essential for robust performance, that larger models perform better in complex relationship classification, and that closed-source models often lead, while open-source models show significant promise, particularly with retrieval augmentation.
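The first task the abstract describes, tracing a target sentence back to its most likely source sentence, can be sketched with a minimal bag-of-words retriever. This is an illustrative toy under stated assumptions, not the paper's pipeline (which uses stronger retrieval plus GPT and human verification); all function names and example sentences below are hypothetical.

```python
import math
import re
from collections import Counter

def bag_of_words(sentence: str) -> Counter:
    """Lowercase token counts; a stand-in for a real retriever's embeddings."""
    return Counter(re.findall(r"\w+", sentence.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def trace_sentence(target: str, source_sentences: list[str]) -> tuple[int, float]:
    """Return the index and score of the most similar source sentence."""
    tv = bag_of_words(target)
    scores = [cosine(tv, bag_of_words(s)) for s in source_sentences]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]

# Hypothetical multi-document source pool and a target sentence to trace.
sources = [
    "The company reported revenue of 4.2 billion dollars in 2023.",
    "Analysts expect growth to slow next year.",
    "The CEO announced a new product line at the conference.",
]
idx, score = trace_sentence("Revenue reached 4.2 billion dollars in 2023.", sources)
```

In the long-context settings the benchmark targets (10k+ tokens, multiple documents), lexical overlap alone degrades, which is consistent with the paper's finding that retrieval augmentation is essential.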
Problem

Research questions and friction points this paper is trying to address.

Trace sentences to specific source sentences in multi-document inputs.
Classify fine-grained relationships between target and source sentences.
Evaluate LLMs on provenance tasks in diverse scenarios and languages.
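As a rough illustration of the second friction point, the four-way relation schema (quotation, compression, inference, paraphrase), a token-overlap heuristic can separate the easy cases. This is a sketch only: the benchmark relies on LLM evaluation and expert annotation, and the thresholds below are arbitrary assumptions, not values from the paper.

```python
import re

def classify_relation(target: str, source: str) -> str:
    """Crude token-overlap heuristic over TROVE's four relation labels.

    Illustrative only: real classification needs semantic understanding,
    and the 0.8 / 0.5 thresholds are arbitrary assumptions."""
    t_tokens = set(re.findall(r"\w+", target.lower()))
    s_tokens = set(re.findall(r"\w+", source.lower()))
    overlap = len(t_tokens & s_tokens) / max(len(t_tokens), 1)
    if target.strip().lower() in source.lower():
        return "quotation"    # target copied verbatim from the source
    if overlap > 0.8 and len(t_tokens) < len(s_tokens):
        return "compression"  # shorter restatement reusing source wording
    if overlap > 0.5:
        return "paraphrase"   # substantial lexical overlap, reworded
    return "inference"        # little overlap: content derived, not copied

label = classify_relation(
    "The plan was approved.",
    "After a long debate, the city council finally approved the plan on Tuesday.",
)
```

Cases such as cross-lingual provenance, which the benchmark covers in English and Chinese, defeat any purely lexical heuristic and motivate the retrieval-augmented LLM evaluation.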
Innovation

Methods, ideas, or system contributions that make the work stand out.

TROVE traces sentences to specific sources.
Annotates fine-grained relationships between sentences.
Uses three-stage annotation for high-quality data.
Junnan Zhu
Institute of Automation, Chinese Academy of Sciences
Natural Language Processing
Min Xiao
State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, CAS, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
Yining Wang
Unisound AI Technology Co., Ltd.
Feifei Zhai
Institute of Automation, Chinese Academy of Sciences
Machine Translation, Natural Language Processing, Machine Learning
Yu Zhou
State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, CAS, Beijing, China
Chengqing Zong
State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, CAS, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China