🤖 AI Summary
This study investigates the implicit mechanisms by which large language models (LLMs) assess information importance during abstractive summarization, a process that typically lacks transparency. By generating length-controlled summaries to construct empirical importance distributions, and by combining attention-head alignment analysis with an evaluation of cross-layer predictive capability, the work systematically reveals consistent, model-family-specific importance patterns in LLMs. The findings demonstrate that LLMs' importance judgments differ markedly from those of traditional models, that consistency within a model family outweighs the influence of model scale, and that specific attention heads in middle-to-late layers effectively predict salient content. This research is the first to localize neural components aligned with importance judgments, establishing a foundation for an interpretable understanding of LLM-based summarization mechanisms.
📝 Abstract
Large Language Models (LLMs) are now state-of-the-art at summarization, yet the internal notion of importance that drives their information selection remains hidden. We investigate this by combining behavioral and computational analyses. Behaviorally, we generate a series of length-controlled summaries for each document and derive empirical importance distributions from how often each information unit is selected. These reveal that LLMs converge on consistent importance patterns that differ sharply from pre-LLM baselines, and that LLMs cluster more by model family than by size. Computationally, we show that certain attention heads align closely with the empirical importance distributions, and that middle-to-late layers are strongly predictive of importance. Together, these results provide initial insights into what LLMs prioritize in summarization and how this priority is internally represented, opening a path toward interpreting and ultimately controlling information selection in these models.
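The behavioral analysis can be illustrated with a minimal sketch. The idea, as described above, is to treat each length-controlled summary as one "vote" over a document's information units and to take the per-unit selection frequency as the empirical importance distribution; an attention head's alignment with that distribution can then be scored with a simple correlation. All function names and the correlation choice here are illustrative assumptions, not the paper's exact procedure:

```python
# Illustrative sketch (names and metric choices are assumptions, not the
# paper's exact method). Each summary is represented as the set of
# information-unit IDs it selected.
from math import sqrt


def importance_distribution(summaries, all_units):
    """Empirical importance: fraction of length-controlled summaries
    that select each information unit."""
    n = len(summaries)
    return {u: sum(u in set(s) for s in summaries) / n for u in all_units}


def pearson(xs, ys):
    """Plain Pearson correlation, used here as one possible alignment
    score between a head's per-unit attention mass and importance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


# Toy example: three summaries of varying length over units a, b, c.
units = ["a", "b", "c"]
summaries = [["a", "b"], ["a"], ["a", "c"]]
dist = importance_distribution(summaries, units)  # a=1.0, b=1/3, c=1/3

# Hypothetical per-unit attention mass for one head; a high correlation
# with `dist` would mark this head as importance-aligned.
head_attention = {"a": 0.6, "b": 0.2, "c": 0.2}
score = pearson([head_attention[u] for u in units], [dist[u] for u in units])
```

In practice, an "information unit" might be a sentence, proposition, or entity, and alignment would be computed per head across many documents; this toy version only shows the shape of the computation.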