Surprisal and Metaphor Novelty Judgments: Moderate Correlations and Divergent Scaling Effects Revealed by Corpus-Based and Synthetic Datasets

📅 2026-01-05
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates whether language model surprisal can capture metaphorical novelty and how this relationship varies across data types. By systematically evaluating cloze-style surprisal values from 16 language models of varying scale and architecture against human-rated novelty scores on both corpus-derived and synthetically generated metaphor datasets, the work finds a significant but moderate correlation between surprisal and perceived novelty. Notably, the correlation weakens as model scale increases on naturally occurring corpora, whereas it strengthens on synthetic data, revealing opposing scaling trends. These findings suggest that surprisal, while informative, is a context-dependent and limited proxy for linguistic creativity and should be interpreted with caution depending on the nature of the underlying data.

📝 Abstract
Novel metaphor comprehension involves complex semantic processes and linguistic creativity, making it an interesting task for studying language models (LMs). This study investigates whether surprisal, a probabilistic measure of predictability in LMs, correlates with annotations of metaphor novelty in different datasets. We analyse the surprisal of metaphoric words in corpus-based and synthetic metaphor datasets using 16 causal LM variants. We propose a cloze-style surprisal method that conditions on full-sentence context. Results show that LM surprisal yields significant moderate correlations with scores/labels of metaphor novelty. We further identify divergent scaling patterns: on corpus-based data, correlation strength decreases with model size (inverse scaling effect), whereas on synthetic data it increases (quality-power hypothesis). We conclude that while surprisal can partially account for annotations of metaphor novelty, it remains limited as a metric of linguistic creativity. Code and data are publicly available: https://github.com/OmarMomen14/surprisal-metaphor-novelty
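The core quantity in the abstract is surprisal, the negative log-probability a causal LM assigns to the metaphoric word given its context. The sketch below illustrates that definition only; it uses made-up probabilities in place of a real LM's conditional distribution (the paper's actual method is a cloze-style setup conditioning on full-sentence context across 16 model variants, which is not reproduced here). The example sentence and probability values are purely illustrative assumptions.

```python
import math

# Stand-in for a causal LM's next-word distribution P(w | context) for a
# hypothetical frame such as "Her voice was pure ___".
# The probabilities below are invented for illustration, not model outputs.
p_next = {
    "gold": 0.020,    # conventional metaphor: relatively predictable
    "velvet": 0.002,  # more novel metaphor: less predictable
}

def surprisal(word: str, dist: dict) -> float:
    """Surprisal in bits: -log2 P(word | context)."""
    return -math.log2(dist[word])

# A more novel metaphoric word should carry higher surprisal.
print(surprisal("gold", p_next))    # about 5.64 bits
print(surprisal("velvet", p_next))  # about 8.97 bits
```

The paper's correlational analysis then asks whether these per-word surprisal values track human novelty ratings across a dataset (e.g. via a rank correlation), which is where the divergent scaling patterns emerge.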
Problem

Research questions and friction points this paper is trying to address.

surprisal, metaphor novelty, language models, linguistic creativity, scaling effects
Innovation

Methods, ideas, or system contributions that make the work stand out.

surprisal, metaphor novelty, scaling effects, language models, cloze-style evaluation
Omar Momen
CRC 1646 – Linguistic Creativity in Communication, Faculty of Linguistics and Literary Studies, Bielefeld University, Germany
Emilie Sitter
CRC 1646 – Linguistic Creativity in Communication, Faculty of Linguistics and Literary Studies, Bielefeld University, Germany
Berenike Herrmann
CRC 1646 – Linguistic Creativity in Communication, Faculty of Linguistics and Literary Studies, Bielefeld University, Germany
Sina Zarrieß
Professor for Computational Linguistics, Bielefeld University
Computational linguistics, machine learning, language generation, dialogue, computational semantics