ARC: Argument Representation and Coverage Analysis for Zero-Shot Long Document Summarization with Instruction Following LLMs

📅 2025-05-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses zero-shot summarization of long documents in legal and scientific domains, focusing on preserving key argumentative structures—such as claims, evidence, and counterarguments—in generated summaries. To this end, we propose the Argument Representation Coverage (ARC) evaluation framework, the first to formalize argument-role coverage as a quantifiable, attributable metric for summary quality. Leveraging open-source LLMs—including Llama, Qwen, and Phi-3—we integrate structured argument annotation with coverage computation. Our analysis reveals systematic deficiencies in current LLMs: pronounced positional bias (loss of arguments appearing toward document ends) and role-based preference (systematic underrepresentation of supportive arguments), exacerbated under sparse argument distribution. ARC establishes a reproducible, interpretable paradigm for assessing structural fidelity in long-document summarization, enabling fine-grained diagnosis of argument preservation capabilities beyond surface-level fluency or informativeness.

📝 Abstract
Integrating structured information has long improved the quality of abstractive summarization, particularly in retaining salient content. In this work, we focus on a specific form of structure: argument roles, which are crucial for summarizing documents in high-stakes domains such as law. We investigate whether instruction-tuned large language models (LLMs) adequately preserve this information. To this end, we introduce Argument Representation Coverage (ARC), a framework for measuring how well LLM-generated summaries capture salient arguments. Using ARC, we analyze summaries produced by three open-weight LLMs in two domains where argument roles are central: long legal opinions and scientific articles. Our results show that while LLMs cover salient argument roles to some extent, critical information is often omitted in generated summaries, particularly when arguments are sparsely distributed throughout the input. Further, we use ARC to uncover behavioral patterns -- specifically, how the positional bias of LLM context windows and role-specific preferences impact the coverage of key arguments in generated summaries, emphasizing the need for more argument-aware summarization strategies.
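The abstract describes ARC as measuring how well a summary covers salient argument roles. The paper does not spell out the matching procedure here, so the following is a hypothetical minimal sketch: it assumes pre-annotated argument spans with roles, and counts an argument as "covered" when enough of its tokens appear in the summary. The `Argument` class, the unigram-overlap matcher, and the `0.5` threshold are all illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass

@dataclass
class Argument:
    text: str        # annotated argument span from the source document
    role: str        # e.g. "claim", "evidence", "counterargument"
    position: float  # relative position in the source (0.0 = start, 1.0 = end)

def _overlap(arg_text: str, summary: str) -> float:
    """Fraction of the argument's unigrams that appear in the summary.
    A crude stand-in for whatever matching ARC actually uses."""
    arg_tokens = set(arg_text.lower().split())
    summary_tokens = set(summary.lower().split())
    return len(arg_tokens & summary_tokens) / max(len(arg_tokens), 1)

def arc_coverage(arguments, summary, threshold=0.5):
    """Per-role coverage: share of arguments of each role 'covered'
    by the summary (overlap >= threshold)."""
    covered, totals = {}, {}
    for arg in arguments:
        totals[arg.role] = totals.get(arg.role, 0) + 1
        if _overlap(arg.text, summary) >= threshold:
            covered[arg.role] = covered.get(arg.role, 0) + 1
    return {role: covered.get(role, 0) / n for role, n in totals.items()}
```

Per-role scores (rather than one pooled number) are what let the analysis expose role-specific gaps, and the stored `position` field supports the positional-bias analysis by checking whether uncovered arguments cluster toward the end of the input.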
Problem

Research questions and friction points this paper is trying to address.

Measuring argument role coverage in LLM summaries
Assessing LLM performance on legal and scientific documents
Identifying positional bias in argument retention by LLMs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates argument roles into summarization quality assessment
Measures argument coverage with ARC framework
Analyzes LLM positional bias in summaries