🤖 AI Summary
This study systematically evaluates the performance of 19 small language models (SLMs) on news summarization in resource-constrained settings, benchmarking them against 70B-parameter large language models (LLMs) on summary quality, coherence, factual consistency, and length efficiency.
Method: Using a 2,000-sample news corpus, we employ a multi-dimensional evaluation framework combining human assessment with automated metrics—ROUGE, BERTScore, and fact-checking—and conduct controlled prompt experiments, instruction-tuning ablations, and computational complexity analysis.
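Two of the automated metrics above are straightforward to sketch: ROUGE-1 (unigram overlap between a candidate summary and a reference) and the summary length reduction used to quantify conciseness. The snippet below is a minimal, illustrative implementation in pure Python; the function names and the word-level tokenization are assumptions for illustration, not the paper's exact evaluation code (which pairs these scores with BERTScore, fact-checking, and human assessment).

```python
from collections import Counter

def rouge1_f1(reference: str, summary: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall
    between a candidate summary and a reference summary."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(summary.lower().split())
    # Clipped overlap: each unigram counts at most as often as in the reference.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

def length_reduction(baseline_summary: str, slm_summary: str) -> float:
    """Fractional reduction in word count of an SLM summary
    relative to a baseline (e.g., 70B LLM) summary."""
    base_len = len(baseline_summary.split())
    slm_len = len(slm_summary.split())
    return (base_len - slm_len) / base_len
```

For example, `rouge1_f1("the cat sat on the mat", "the cat sat")` yields 2/3 (precision 1.0, recall 0.5), and a 6-word summary against a 10-word baseline gives a `length_reduction` of 0.4, the same kind of figure behind the reported 35% average reduction.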
Contribution/Results: We empirically demonstrate that top-performing SLMs (e.g., Phi3-Mini) match 70B LLMs in summary quality while reducing output length by 35% on average. Simpler prompts outperform complex instructions, and instruction tuning yields no consistent gains for news summarization. This work is the first to chart the practical performance limits of SLMs for news summarization and an efficient pathway for deploying them.
📝 Abstract
The increasing demand for summarization tools in resource-constrained environments highlights the need for efficient yet effective solutions. While large language models (LLMs) deliver superior summarization quality, their high computational resource requirements limit their practical application. In contrast, small language models (SLMs) present a more accessible alternative, capable of real-time summarization on edge devices. However, their summarization capabilities and comparative performance against LLMs remain underexplored. This paper addresses this gap by presenting a comprehensive evaluation of 19 SLMs for news summarization across 2,000 news samples, focusing on relevance, coherence, factual consistency, and summary length. Our findings reveal significant variations in SLM performance, with top-performing models such as Phi3-Mini and Llama3.2-3B-Ins achieving results comparable to those of 70B LLMs while generating more concise summaries. Notably, SLMs are better suited to simple prompts, as overly complex prompts can degrade summary quality. Additionally, our analysis indicates that instruction tuning does not consistently enhance the news summarization capabilities of SLMs. This research not only contributes to the understanding of SLMs but also provides practical insights for researchers seeking efficient summarization solutions that balance performance and resource use.