SCURank: Ranking Multiple Candidate Summaries with Summary Content Units for Enhanced Summarization

📅 2026-04-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses two problems in ranking candidate summaries for distillation: existing large language model (LLM)-based ranking strategies are unstable, and conventional metrics such as ROUGE cannot reliably separate high-quality summaries from one another. The authors propose SCURank, a framework that ranks candidate summaries using Summary Content Units (SCUs), which are information-centric semantic units. SCURank extracts SCUs and ranks them semantically to assess both how much information a summary carries and how important that information is, avoiding both unreliable direct LLM comparisons and superficial n-gram overlap. The approach also draws on diverse summaries generated by multiple LLMs and uses them to improve smaller models' summarization through knowledge distillation. In the reported experiments, SCURank outperforms traditional metrics and LLM-based ranking methods across datasets and evaluation measures, and incorporating diverse LLM summaries improves the abstractiveness and overall performance of the distilled models.

📝 Abstract

Small language models (SLMs), such as BART, can achieve summarization performance comparable to large language models (LLMs) via distillation. However, existing LLM-based ranking strategies for summary candidates suffer from instability, while classical metrics (e.g., ROUGE) are insufficient to rank high-quality summaries. To address these issues, we introduce SCURank, a framework that enhances summarization by leveraging Summary Content Units (SCUs). Instead of relying on unstable comparisons or surface-level overlap, SCURank evaluates summaries based on the richness and semantic importance of information content. We investigate the effectiveness of SCURank in distilling summaries from multiple diverse LLMs. Experimental results demonstrate that SCURank outperforms traditional metrics and LLM-based ranking methods across evaluation measures and datasets. Furthermore, our findings show that incorporating diverse LLM summaries enhances model abstractiveness and overall distilled model performance, validating the benefits of information-centric ranking in multi-LLM distillation. The code for SCURank is available at https://github.com/IKMLab/SCURank.
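
To make the idea of information-centric ranking concrete, here is a minimal sketch. It is not the SCURank implementation (see the repository linked above for that). It assumes SCUs have already been extracted from each candidate and aligned across candidates as shared identifiers, and it substitutes a simple pyramid-style weighting in which an SCU counts for more when more candidates express it. The function name `rank_by_scu_weight` and the SCU identifiers are hypothetical.

```python
from collections import Counter


def rank_by_scu_weight(candidate_scus):
    """Rank candidate summaries by the total weight of the SCUs they contain.

    candidate_scus: dict mapping a candidate name to a set of SCU identifiers.
    Assumption: extraction and cross-candidate alignment of SCUs happened
    upstream; this function only does the scoring and sorting.
    """
    # Pyramid-style weight (an assumption, not the paper's scoring rule):
    # an SCU's weight is the number of candidates that express it.
    weights = Counter(
        scu for scus in candidate_scus.values() for scu in set(scus)
    )

    # A candidate's score is the sum of the weights of its distinct SCUs,
    # so it rewards both covering more content and covering shared content.
    scores = {
        name: sum(weights[scu] for scu in set(scus))
        for name, scus in candidate_scus.items()
    }

    # Highest score first; ties are broken by name for a deterministic order.
    ranking = sorted(scores, key=lambda name: (-scores[name], name))
    return ranking, scores


# Toy input: three candidates from hypothetical LLMs with pre-aligned SCU ids.
candidates = {
    "llm_a": {"scu1", "scu2", "scu3"},
    "llm_b": {"scu1", "scu2"},
    "llm_c": {"scu1", "scu4"},
}
ranking, scores = rank_by_scu_weight(candidates)
print(ranking)
print(scores)
```

Because the score depends only on which content units a summary expresses, two candidates that word the same facts differently receive the same score, which is the property that surface-overlap metrics such as ROUGE lack.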

Problem

Research questions and friction points this paper is trying to address.

summary ranking

large language models

distillation

evaluation metrics

instability

Innovation

Methods, ideas, or system contributions that make the work stand out.

SCURank

Summary Content Units

multi-LLM distillation