🤖 AI Summary
This work investigates the implicit notion of information salience that underlies large language models' (LLMs) text summarization behavior. Because LLMs' internal salience preferences are opaque, we propose a behaviorally grounded, interpretable framework for quantifying salience: it combines length-controlled summarization with tracking the answerability of Questions Under Discussion (QUD) to systematically uncover a model's content selection biases. Empirical evaluation across 13 LLMs and four benchmark datasets reveals a nuanced, hierarchical salience pattern that is largely consistent across model families and scales, yet introspectively inaccessible to the models themselves, only weakly aligned with human intuition, and not directly recoverable from internal signals such as attention weights or gradients. Our study establishes a reproducible, behavior-based paradigm and analytical toolkit for probing how LLMs assess salience when they summarize.
📝 Abstract
Large Language Models (LLMs) excel at text summarization, a task that requires models to select content based on its importance. However, the exact notion of salience that LLMs have internalized remains unclear. To bridge this gap, we introduce an explainable framework to systematically derive and investigate information salience in LLMs through their summarization behavior. Using length-controlled summarization as a behavioral probe into the content selection process, and tracing the answerability of Questions Under Discussion throughout, we derive a proxy for how models prioritize information. Our experiments on 13 models across four datasets reveal that LLMs have a nuanced, hierarchical notion of salience that is generally consistent across model families and sizes. While models show highly consistent behavior, and hence consistent salience patterns, this notion of salience is not accessible through introspection and correlates only weakly with human perceptions of information salience.
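To make the probing idea concrete, the sketch below shows one way the behavioral proxy could be operationalized: generate summaries at several length budgets and score each QUD by how often it remains answerable as the summary shrinks. The paper does not specify this exact scoring, so the aggregation over budgets, and the `summarize` and `answerable` callables (stand-ins for an LLM summarizer and a QA-based answerability judge), are assumptions for illustration only.

```python
from typing import Callable, Dict, List

def salience_scores(
    document: str,
    questions: List[str],
    budgets: List[int],
    summarize: Callable[[str, int], str],    # hypothetical: LLM prompted for an n-sentence summary
    answerable: Callable[[str, str], bool],  # hypothetical: QA judge for (question, summary)
) -> Dict[str, float]:
    """Score each Question Under Discussion (QUD) by how well it
    survives compression: a question that stays answerable even in the
    shortest summaries is treated as more salient to the model."""
    scores = {q: 0.0 for q in questions}
    for n in budgets:
        # length-controlled summarization acts as the behavioral probe
        summary = summarize(document, n)
        for q in questions:
            if answerable(q, summary):
                # each budget at which q remains answerable adds weight;
                # normalizing by the budget count keeps scores in [0, 1]
                scores[q] += 1.0 / len(budgets)
    return scores
```

Under these assumptions, a call such as `salience_scores(doc, quds, budgets=[1, 3, 5, 10], summarize=my_llm, answerable=my_qa_judge)` would yield a per-question salience proxy that can then be compared across models, against the models' own introspective reports, or against human salience judgments.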