Published in Journal of Memory and Language (2025): 'Dissociable frequency effects attenuate as large language model surprisal predictors improve'.
Published at EACL 2024: 'Frequency explains the inverse correlation of large language models’ size, training data amount, and surprisal’s fit to reading times'.
Published at ACL 2023: 'Token-wise decomposition of autoregressive language model hidden states for analyzing model predictions'.
Published in TACL (2023): 'Why does surprisal from larger Transformer-based language models provide a poorer fit to human reading times?'.
Published at EMNLP 2022: 'Entropy- and distance-based predictors from GPT-2 attention patterns predict reading times over and above GPT-2 surprisal'.