🤖 AI Summary
Traditional surprisal metrics derived from large language models (LLMs) struggle to capture the cognitive load humans experience when processing garden-path sentences. This work proposes a joint latent-variable mixture model that integrates data from four experimental paradigms (eye-tracking, uni- and bidirectional self-paced reading, and Maze) within a unified framework. The model explicitly disentangles garden-path probability, garden-path cost, and reanalysis cost, while also accounting for attention-lapse trials. Moving beyond reliance on LLM surprisal alone, the approach reproduces key empirical patterns, including rereading behavior, comprehension question responses, and grammaticality judgments. In cross-validation, it outperforms a surprisal-only baseline at predicting both human reading behavior and end-of-trial task performance.
📝 Abstract
Using temporarily ambiguous garden-path sentences ("While the team trained the striker wondered ...") as a test case, we present a latent-process mixture model of human reading behavior across four different reading paradigms (eye tracking, uni- and bidirectional self-paced reading, Maze). The model distinguishes between garden-path probability, garden-path cost, and reanalysis cost, and yields more realistic processing cost estimates by taking into account trials with inattentive reading. We show that the model is able to reproduce empirical patterns with regard to rereading behavior, comprehension question responses, and grammaticality judgments. Cross-validation reveals that the mixture model also has better predictive fit to human reading patterns and end-of-trial task data than a mixture-free model based on GPT-2-derived surprisal values. We discuss implications for future work.
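The latent-process mixture the abstract describes can be pictured generatively: each trial is either an attention lapse (reading time unrelated to the sentence) or an attentive read, and an attentive trial on a temporarily ambiguous sentence incurs a garden-path cost, plus a reanalysis cost, with some latent probability. The sketch below is a minimal illustration of that idea, not the paper's actual model; all function names, parameter names, and values (lapse rate, log-normal reading times, cost sizes) are assumptions made for exposition.

```python
import math
import random

def simulate_rt(p_lapse=0.05, p_gp=0.6, mu=5.6, sigma=0.3,
                gp_cost=0.25, reanalysis_cost=0.15, rng=None):
    """Draw one illustrative reading time (ms) for a disambiguating region.

    Hypothetical generative sketch of a latent-process mixture:
      - with probability p_lapse the trial is an attention lapse;
      - otherwise the base reading time is log-normal;
      - with probability p_gp the reader is garden-pathed and pays
        both a garden-path cost and a reanalysis cost (in log space).
    """
    rng = rng or random.Random()
    if rng.random() < p_lapse:
        # Inattentive trial: time uncoupled from sentence content.
        return rng.uniform(80, 300)
    log_rt = mu + rng.gauss(0, sigma)
    if rng.random() < p_gp:
        # Garden-pathed trial: extra processing cost at disambiguation.
        log_rt += gp_cost + reanalysis_cost
    return math.exp(log_rt)
```

Because the garden-path and lapse components are separate mixture branches, fitting such a model can attribute long reading times to a garden path rather than inflating a single surprisal-style cost estimate with inattentive trials; setting `p_gp=0` gives the matched unambiguous condition.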