N-gram-like Language Models Predict Reading Time Best

πŸ“… 2026-03-10
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
Although advanced language models excel at next-word prediction, they perform markedly worse than simple n-gram models in predicting human reading times. This study systematically evaluates the correlation between reading times and model outputs by integrating eye-tracking data, neural language models, and n-gram probabilities. The findings reveal that n-gram–based models significantly outperform complex semantic models such as Transformers in predicting reading times for natural text, suggesting that human reading behavior relies more heavily on local statistical regularities than on deep semantic processing. These results challenge the prevailing assumption that large-scale language models universally capture human-like cognitive processes and offer a new perspective on computational modeling of language comprehension and reading behavior.

πŸ“ Abstract
Recent work has found that contemporary language models such as transformers can become so good at next-word prediction that the probabilities they calculate become worse for predicting reading time. In this paper, we propose that this can be explained by reading time being sensitive to simple n-gram statistics rather than the more complex statistics learned by state-of-the-art transformer language models. We demonstrate that the neural language models whose predictions are most correlated with n-gram probability are also those that calculate probabilities that are the most correlated with eye-tracking-based metrics of reading time on naturalistic text.
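The core comparison described in the abstract — scoring each word's n-gram surprisal and checking how well it correlates with reading time — can be sketched in a few lines. This is an illustrative toy, not the paper's pipeline: the corpus, the per-word reading times, and the add-one-smoothed bigram estimator are all hypothetical stand-ins for the naturalistic text, eye-tracking metrics, and n-gram models the authors actually use.

```python
import math
from collections import Counter

def train_bigram(tokens, alpha=1.0):
    """Fit an add-alpha smoothed bigram model and return a surprisal
    function (illustrative; the paper's exact estimator may differ)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    vocab = len(set(tokens))

    def surprisal(prev, word):
        # Surprisal = -log2 P(word | prev), with add-alpha smoothing.
        p = (bigrams[(prev, word)] + alpha) / (unigrams[prev] + alpha * vocab)
        return -math.log2(p)

    return surprisal

def pearson(xs, ys):
    """Plain Pearson correlation, no external dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    vy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (vx * vy)

# Toy corpus and made-up per-word reading times in milliseconds
# (one reading time per bigram-scored word).
corpus = "the cat sat on the mat and the cat ran".split()
surprisal = train_bigram(corpus)
s = [surprisal(p, w) for p, w in zip(corpus, corpus[1:])]
rt = [210, 305, 250, 230, 260, 240, 220, 300, 310]  # hypothetical RTs
r = pearson(s, rt)
```

In the paper's framing, the same correlation would also be computed for transformer-derived surprisal, and the two coefficients compared across models.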
Problem

Research questions and friction points this paper addresses.

Keywords: reading time, language models, n-gram, transformer, eye-tracking
Innovation

Methods, ideas, or system contributions that make the work stand out.

Keywords: n-gram, reading time prediction, transformer language models, eye-tracking, cognitive modeling
James A. Michaelov
Massachusetts Institute of Technology
Cognitive Science, Linguistics

Roger P. Levy
Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology