🤖 AI Summary
This work addresses two weaknesses in how candidate summaries are ranked for distillation: existing large language model (LLM)-based ranking strategies are unstable, and conventional metrics such as ROUGE cannot reliably separate high-quality summaries. The authors propose SCURank, a framework that ranks candidate summaries by their Summary Content Units (SCUs), which are information-centric semantic units. By scoring the richness and semantic importance of each summary's information content, SCURank avoids both unreliable direct LLM comparisons and superficial n-gram overlap. The approach also draws on diverse summaries generated by multiple LLMs to improve smaller models' summarization through knowledge distillation. Experiments show that SCURank outperforms traditional metrics and LLM-based ranking methods across datasets and evaluation measures, and that incorporating diverse LLM summaries improves the abstractiveness and overall performance of the distilled models.
📝 Abstract
Small language models (SLMs), such as BART, can achieve summarization performance comparable to large language models (LLMs) via distillation. However, existing LLM-based ranking strategies for summary candidates suffer from instability, while classical metrics (e.g., ROUGE) are insufficient to rank high-quality summaries. To address these issues, we introduce SCURank, a framework that enhances summarization by leveraging Summary Content Units (SCUs). Instead of relying on unstable comparisons or surface-level overlap, SCURank evaluates summaries based on the richness and semantic importance of information content. We investigate the effectiveness of SCURank in distilling summaries from multiple diverse LLMs. Experimental results demonstrate that SCURank outperforms traditional metrics and LLM-based ranking methods across evaluation measures and datasets. Furthermore, our findings show that incorporating diverse LLM summaries enhances model abstractiveness and overall distilled model performance, validating the benefits of information-centric ranking in multi-LLM distillation. The code for SCURank is available at https://github.com/IKMLab/SCURank.
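To make the idea of information-centric ranking concrete, here is a minimal Python sketch. It is not the SCURank implementation; the abstract does not specify the scoring, so everything below is an assumption made for illustration. It assumes SCUs have already been extracted for each candidate summary (the hypothetical `candidates` dictionary), matches SCUs by normalized exact string rather than by semantics, and borrows a Pyramid-style weight (how many candidates express an SCU) as a stand-in for "semantic importance". The names `normalize` and `rank_by_scu` are hypothetical.

```python
from collections import Counter


def normalize(scu: str) -> str:
    """Toy SCU matcher: lowercase and collapse whitespace (assumption, not semantic matching)."""
    return " ".join(scu.lower().split())


def rank_by_scu(candidates: dict[str, list[str]]) -> list[tuple[str, int]]:
    """Rank candidate summaries by the summed weight of the distinct SCUs they contain.

    Assumed weighting: an SCU's weight is the number of candidates expressing it.
    """
    # Count, for each distinct SCU, how many candidates express it.
    support: Counter[str] = Counter()
    for scus in candidates.values():
        support.update({normalize(s) for s in scus})

    # A candidate's score grows with both the number of SCUs (richness)
    # and how widely each SCU is shared (stand-in for importance).
    scores = {
        name: sum(support[s] for s in {normalize(x) for x in scus})
        for name, scus in candidates.items()
    }
    # Highest score first; ties broken by name for a deterministic order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical pre-extracted SCUs for three candidate summaries.
candidates = {
    "llm_a": ["The bridge closed on Monday", "Repairs cost $2 million"],
    "llm_b": ["the bridge closed on monday"],
    "llm_c": [
        "The bridge closed on Monday",
        "Repairs cost $2 million",
        "Traffic was rerouted",
    ],
}
ranking = rank_by_scu(candidates)
print(ranking)
```

In this toy setup the candidate covering the most widely shared content units ranks first, which is the intuition behind ranking on information content rather than on n-gram overlap or pairwise LLM judgments. A ranking produced this way could then order the candidates used as distillation targets for a smaller model.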