Surprisal from Larger Transformer-based Language Models Predicts fMRI Data More Poorly

📅 2025-06-12
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
It remains unclear whether surprisal from larger Transformer language models is a poorer predictor of human neural responses, specifically voxel-wise fMRI activity, than surprisal from smaller models. Method: Voxel-wise linear encoding models were built from word-level surprisal estimates of 17 pre-trained Transformer models spanning three language families, and their goodness-of-fit was evaluated on two independent fMRI datasets. Contribution/Results: In the first test of this trend on brain imaging data, models with more parameters and more training data yielded poorer fMRI predictions, a pattern that replicated across languages and datasets. Crucially, model perplexity correlated positively with fMRI encoding accuracy. This extends the counterintuitive behavioral finding that stronger language models provide poorer fits to reading-time data to the neuroimaging domain, pointing to a tension between language model capability and fit to human neural data.
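A minimal sketch of the pipeline described above, assuming a Hugging Face causal language model (gpt2 is used only as an example) and a ridge-regression encoding model fit per voxel. The stimulus sentence, the simulated BOLD matrix, the ridge penalty, and the omission of HRF convolution are illustrative placeholders rather than the paper's actual setup.

```python
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

def token_surprisals(text: str, model_name: str = "gpt2") -> np.ndarray:
    """Per-token surprisal in bits; a real analysis would aggregate subwords to words."""
    tok = AutoTokenizer.from_pretrained(model_name)
    lm = AutoModelForCausalLM.from_pretrained(model_name).eval()
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = lm(ids).logits
    # Log probability of each token given its left context (positions shifted by one).
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    token_lp = log_probs[torch.arange(ids.shape[1] - 1), ids[0, 1:]]
    return (-token_lp / np.log(2)).numpy()

# Placeholder stimulus and simulated voxel responses (n_tokens x n_voxels);
# a real pipeline would convolve surprisal with an HRF and align it to fMRI TRs.
surprisal = token_surprisals("The cat sat on the mat because it was tired.")
rng = np.random.default_rng(0)
bold = rng.normal(size=(len(surprisal), 50))

# Voxel-wise linear encoding model: regress BOLD on surprisal, score with cross-validated R^2.
X = surprisal.reshape(-1, 1)
scores = [cross_val_score(Ridge(alpha=1.0), X, bold[:, v], cv=3).mean()
          for v in range(bold.shape[1])]
print(f"Mean cross-validated R^2 across voxels: {np.mean(scores):.3f}")
```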

📝 Abstract
As Transformers become more widely incorporated into natural language processing tasks, there has been considerable interest in using surprisal from these models as predictors of human sentence processing difficulty. Recent work has observed a positive relationship between Transformer-based models' perplexity and the predictive power of their surprisal estimates on reading times, showing that language models with more parameters and trained on more data are less predictive of human reading times. However, these studies focus on predicting latency-based measures (i.e., self-paced reading times and eye-gaze durations) with surprisal estimates from Transformer-based language models. This trend has not been tested on brain imaging data. This study therefore evaluates the predictive power of surprisal estimates from 17 pre-trained Transformer-based models across three different language families on two functional magnetic resonance imaging datasets. Results show that the positive relationship between model perplexity and model fit still obtains, suggesting that this trend is not specific to latency-based measures and can be generalized to neural measures.
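A rough sketch of the headline analysis the abstract reports, assuming each language model contributes one perplexity value and one fMRI encoding score (for example, mean cross-validated R^2 across voxels), with a rank correlation computed across models. The 17 perplexity and score values below are random placeholders with a built-in positive trend, not numbers from the paper.

```python
import numpy as np
from scipy.stats import spearmanr

def perplexity_fit_correlation(perplexities, encoding_scores):
    """Rank correlation between model perplexity and fMRI encoding fit.

    A positive rho means higher-perplexity (i.e., weaker) language models
    fit the neural data better, the trend reported in the abstract.
    """
    rho, p = spearmanr(perplexities, encoding_scores)
    return rho, p

rng = np.random.default_rng(42)
ppl = rng.uniform(10, 60, size=17)           # 17 models: placeholder perplexities
fit = 0.01 * ppl + rng.normal(0, 0.05, 17)   # placeholder encoding scores with a positive trend
rho, p = perplexity_fit_correlation(ppl, fit)
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")
```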
Problem

Research questions and friction points this paper is trying to address.

Larger Transformer models predict fMRI data less accurately
Surprisal estimates from bigger models poorly match brain activity
Model perplexity positively correlates with neural measure prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

First evaluation of the perplexity-fit trend on brain imaging data
Surprisal from 17 pre-trained Transformer models across three language families tested on two fMRI datasets
Model perplexity positively relates to neural fit