CARPAS: Towards Content-Aware Refinement of Provided Aspects for Summarization in Large Language Models

📅 2025-10-08
🤖 AI Summary
In real-world scenarios, predefined summarization aspects are often incomplete, irrelevant, or entirely absent, leading to misaligned summaries that deviate from document content. To address this, we propose a content-aware dynamic aspect adjustment mechanism: first, a novel aspect cardinality prediction subtask guides large language models (LLMs) to focus on salient aspects while reducing inference complexity; second, aspects are adaptively filtered based on the predicted count to retain only those strongly relevant to the input document. Our method integrates LLMs with four carefully designed prompting strategies and is evaluated on three newly constructed datasets. Experiments demonstrate significant improvements in summary quality and substantial mitigation of over-generation, outperforming all baselines across ROUGE scores and factual consistency metrics. Notably, we empirically uncover, for the first time, LLMs' strong adherence to numerical instructions, establishing a new paradigm for aspect-driven controllable summarization.
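The two-stage idea described above (predict how many of the provided aspects are relevant, then keep only the top-scoring ones before summarizing) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paper uses LLM prompts for relevance judgment and cardinality prediction, whereas `relevance_score` and `predict_aspect_count` below are toy word-overlap stand-ins.

```python
def relevance_score(aspect: str, document: str) -> float:
    """Toy relevance proxy: fraction of the aspect's words found in the
    document. The paper instead queries an LLM; this is a placeholder."""
    words = aspect.lower().split()
    doc = document.lower()
    return sum(w in doc for w in words) / len(words)

def predict_aspect_count(aspects: list[str], document: str,
                         threshold: float = 0.5) -> int:
    """Stand-in for the aspect cardinality prediction subtask: count
    aspects whose toy relevance clears a threshold."""
    return sum(relevance_score(a, document) >= threshold for a in aspects)

def refine_aspects(aspects: list[str], document: str) -> list[str]:
    """Content-aware refinement: predict k first, then retain only the
    k most relevant of the provided aspects."""
    k = predict_aspect_count(aspects, document)
    ranked = sorted(aspects, key=lambda a: relevance_score(a, document),
                    reverse=True)
    return ranked[:k]

doc = ("The new battery offers longer life and faster charging, "
       "though pricing remains high.")
provided = ["battery life", "charging speed", "camera quality", "pricing"]
print(refine_aspects(provided, doc))  # drops the off-document "camera quality"
```

Predicting the count first, rather than asking the model to select aspects in one shot, is the key design choice: the summary and abstract both report that an explicit number acts as guidance that curbs the LLM tendency to keep an overly comprehensive aspect set.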

๐Ÿ“ Abstract
Aspect-based summarization has attracted significant attention for its ability to generate more fine-grained and user-aligned summaries. While most existing approaches assume a set of predefined aspects as input, real-world scenarios often present challenges where these given aspects may be incomplete, irrelevant, or entirely missing from the document. Users frequently expect systems to adaptively refine or filter the provided aspects based on the actual content. In this paper, we initiate this novel task setting, termed Content-Aware Refinement of Provided Aspects for Summarization (CARPAS), with the aim of dynamically adjusting the provided aspects based on the document context before summarizing. We construct three new datasets to facilitate our pilot experiments, and by using LLMs with four representative prompting strategies in this task, we find that LLMs tend to predict an overly comprehensive set of aspects, which often results in excessively long and misaligned summaries. Building on this observation, we propose a preliminary subtask to predict the number of relevant aspects, and demonstrate that the predicted number can serve as effective guidance for the LLMs, reducing the inference difficulty, and enabling them to focus on the most pertinent aspects. Our extensive experiments show that the proposed approach significantly improves performance across all datasets. Moreover, our deeper analyses uncover LLMs' compliance when the requested number of aspects differs from their own estimations, establishing a crucial insight for the deployment of LLMs in similar real-world applications.
Problem

Research questions and friction points this paper is trying to address.

Refining incomplete or irrelevant predefined aspects for summarization
Dynamically adjusting aspects based on document content before summarizing
Reducing overly comprehensive aspect predictions in large language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Content-aware refinement of provided aspects
Predicting number of relevant aspects
Using predicted number to guide LLMs