Testing the Predictions of Surprisal Theory in 11 Languages

📅 2023-07-07
🏛️ Transactions of the Association for Computational Linguistics
📈 Citations: 57
Influential: 5
📄 PDF
🤖 AI Summary
Problem: Surprisal theory has so far been validated primarily on native English speakers reading English text, limiting claims of its cross-linguistic universality. Method: This study systematically tests the theory across 11 typologically diverse languages spanning five major language families. Using monolingual and multilingual pretrained language models, the authors compute word-level surprisal and contextual entropy, then conduct hierarchical regression analyses against multilingual eye-tracking and reading-time data. Contribution/Results: Surprisal and reading time show a robust linear relationship, and contextual entropy carries significant independent predictive power. Crucially, all three core theoretical predictions hold across all 11 languages: (i) surprisal predicts reading time, (ii) contextual entropy is also predictive, and (iii) the linking function between surprisal and reading time is linear. This establishes the broadest and most robust cross-linguistic link to date between information-theoretic measures and incremental language comprehension, providing the strongest multilingual empirical support for surprisal theory.
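The linearity test described above amounts to regressing reading time on surprisal and checking the fit. A minimal sketch of that idea, using synthetic data rather than the paper's eye-tracking corpora (the actual analyses also include baseline covariates such as word length and frequency):

```python
import numpy as np

# Synthetic illustration: generate reading times that depend linearly on
# surprisal, then recover the slope with an ordinary least-squares fit.
rng = np.random.default_rng(0)
surprisal = rng.uniform(1.0, 12.0, size=200)  # bits, hypothetical values
reading_time = 180.0 + 12.0 * surprisal + rng.normal(0.0, 15.0, size=200)  # ms

# Degree-1 polynomial fit = simple linear regression of RT on surprisal.
slope, intercept = np.polyfit(surprisal, reading_time, 1)
print(f"slope: {slope:.1f} ms/bit, intercept: {intercept:.1f} ms")
```

A positive, well-estimated slope with no systematic residual curvature is the kind of evidence the paper reports in favor of a linear linking function.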
📝 Abstract
Surprisal theory posits that less-predictable words should take more time to process, with word predictability quantified as surprisal, i.e., negative log probability in context. While evidence supporting the predictions of surprisal theory has been replicated widely, much of it has focused on a very narrow slice of data: native English speakers reading English texts. Indeed, no comprehensive multilingual analysis exists. We address this gap in the current literature by investigating the relationship between surprisal and reading times in eleven different languages, distributed across five language families. Deriving estimates from language models trained on monolingual and multilingual corpora, we test three predictions associated with surprisal theory: (i) whether surprisal is predictive of reading times, (ii) whether expected surprisal, i.e., contextual entropy, is predictive of reading times, and (iii) whether the linking function between surprisal and reading times is linear. We find that all three predictions are borne out crosslinguistically. By focusing on a more diverse set of languages, we argue that these results offer the most robust link to date between information theory and incremental language processing across languages.
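The abstract's two key quantities follow directly from their definitions: surprisal is the negative log probability of a word in context, and contextual entropy is the expected surprisal over the next-word distribution. A self-contained sketch with a toy distribution (the illustrative probabilities are invented, not taken from the paper's language models):

```python
import math

# Toy next-word distribution p(w | context); purely illustrative numbers.
next_word_probs = {"cat": 0.5, "dog": 0.25, "mat": 0.125, "quark": 0.125}

def surprisal(word, probs):
    """Surprisal in bits: -log2 p(word | context)."""
    return -math.log2(probs[word])

def contextual_entropy(probs):
    """Expected surprisal over the distribution (Shannon entropy, bits)."""
    return sum(-p * math.log2(p) for p in probs.values())

print(surprisal("cat", next_word_probs))    # 1.0 bit: highly predictable
print(surprisal("quark", next_word_probs))  # 3.0 bits: less predictable
print(contextual_entropy(next_word_probs))  # 1.75 bits
```

In the paper, the probabilities come from monolingual and multilingual language models, and these per-word quantities serve as predictors of reading time.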
Problem

Research questions and friction points this paper is trying to address.

Tests surprisal theory in 11 typologically diverse languages
Examines the surprisal–reading-time relationship cross-linguistically
Tests three key predictions of surprisal theory
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multilingual surprisal analysis across 11 languages
Surprisal estimates from language models trained on monolingual and multilingual corpora
Cross-linguistic testing of surprisal theory's predictions