🤖 AI Summary
Background: Existing readability metrics are built mainly to predict reading comprehension outcomes and neglect the core dimension of "reading ease", in particular the facilitation of reading fluency that text simplification is meant to provide.
Method: We propose a cognition-driven evaluation paradigm that treats eye-tracking data as the gold standard and systematically assesses how well diverse metrics explain reading ease. Word-level surprisal is computed with language models (GPT-2/LLaMA) and evaluated within a linear mixed-effects modeling framework against classical readability formulas (e.g., Flesch-Kincaid).
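For concreteness, a word's surprisal is its negative log-probability given the preceding context, s(w_i) = −log P(w_i | w_1 … w_{i−1}). The sketch below is not the paper's code: the model choice (`gpt2`), the helper name `word_surprisals`, and the example sentence are illustrative assumptions. It shows one common way to obtain per-word surprisal from a causal language model with the Hugging Face `transformers` library, summing subword-token surprisals within each whitespace word.

```python
# Minimal sketch (illustrative, not the authors' implementation):
# per-word surprisal in bits from GPT-2 via Hugging Face transformers.
import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def word_surprisals(text):
    """Return (word, surprisal) pairs; a word's surprisal sums over its subword tokens.
    The very first token has no preceding context, so the first word is skipped."""
    enc = tokenizer(text, return_tensors="pt")
    ids = enc["input_ids"]
    with torch.no_grad():
        logits = model(**enc).logits
    # log P(token_t | tokens_<t): position t-1 predicts token t
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    token_surprisal = -log_probs[torch.arange(len(targets)), targets] / math.log(2)

    # regroup BPE pieces into whitespace words ("Ġ"-prefixed pieces start a new word)
    words, surprisals = [], []
    for tok_id, s in zip(targets.tolist(), token_surprisal.tolist()):
        piece = tokenizer.convert_ids_to_tokens(tok_id)
        if piece.startswith("Ġ") or not words:
            words.append(piece.lstrip("Ġ"))
            surprisals.append(s)
        else:
            words[-1] += piece
            surprisals[-1] += s
    return list(zip(words, surprisals))

# A simple text-level reading-ease proxy: mean per-word surprisal
pairs = word_surprisals("The committee adjourned the hearing until further notice.")
print(sum(s for _, s in pairs) / len(pairs))
```

In the mixed-effects analysis described above, such per-word surprisal values would enter as a fixed-effect predictor of word-level reading times, alongside random effects for participants and items; the exact model specification used in the paper is not reproduced here.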
Contribution/Results: Surprisal, an unsupervised and theory-grounded metric, shows substantially higher univariate explanatory power than all traditional metrics (R² improvement of up to 37%). Crucially, it is also more consistent and more robust in predicting the reading facilitation produced by text simplification, making it a more principled and empirically validated indicator of reading ease.
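For reference, the classical formulas being compared against combine only shallow surface statistics. The standard published Flesch Reading Ease (FRE) and Flesch-Kincaid Grade Level (FKGL) formulas are reproduced below; which exact formula variants the paper uses as baselines is not detailed in this summary.

```latex
\mathrm{FRE}  = 206.835 - 1.015\,\frac{\#\text{words}}{\#\text{sentences}} - 84.6\,\frac{\#\text{syllables}}{\#\text{words}}
\qquad
\mathrm{FKGL} = 0.39\,\frac{\#\text{words}}{\#\text{sentences}} + 11.8\,\frac{\#\text{syllables}}{\#\text{words}} - 15.59
```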
📝 Abstract
Text readability measures are widely used in many real-world scenarios and in NLP. These measures have primarily been developed by predicting reading comprehension outcomes, while largely neglecting what is perhaps the core aspect of a readable text: reading ease. In this work, we propose a new eye-tracking-based methodology for evaluating readability measures, which focuses on their ability to account for reading facilitation effects in text simplification, as well as for text reading ease more broadly. Using this approach, we find that existing readability formulas are moderate to poor predictors of reading ease. We further find that average per-word length, frequency, and especially surprisal tend to outperform existing readability formulas as measures of reading ease. We thus propose surprisal as a simple unsupervised alternative to existing measures.