🤖 AI Summary
This study investigates how large language models align with human neural mechanisms of language at both the representational and computational levels. By tracking geometric properties of the representations (entropy and curvature) alongside fMRI encoding performance throughout the training of the Pythia model series (70M–1B parameters), the work reveals for the first time that training drives internal layers to self-organize into distinct high- and low-complexity modules. The low-complexity module predicts activity in human language-related brain regions significantly better than the high-complexity module. Temporal lobe alignment emerges rapidly and stabilizes early, whereas frontal lobe alignment develops later and remains dynamic. Notably, reduced curvature is a robust indicator of brain alignment across model scales and training stages, and its predictive power strengthens as model size increases.
📝 Abstract
How large language models (LLMs) align with the neural representation and computation of human language is a central question in cognitive science. Using representational geometry as a mechanistic lens, we addressed this question by tracking entropy, curvature, and fMRI encoding scores throughout the training of Pythia models (70M–1B). We identified a geometric modularization in which layers self-organize into stable low- and high-complexity clusters. The low-complexity module, characterized by reduced entropy and curvature, consistently predicted human language network activity better than the high-complexity module. This alignment followed heterogeneous spatiotemporal trajectories: rapid and stable in temporal regions (AntTemp, PostTemp), but delayed and dynamic in frontal areas (IFG, IFGorb). Crucially, reduced curvature remained a robust predictor of model-brain alignment even after controlling for training progress, an effect that strengthened with model scale. These results link training-driven geometric reorganization to temporal-frontal functional specialization, suggesting that representational smoothing facilitates neural-like linguistic processing.
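The abstract does not spell out how curvature is computed. As a minimal sketch, assuming one common definition (the mean angle between consecutive displacement vectors along a sequence of token representations within a layer), the quantity could be estimated as follows. The function name `mean_curvature` and the input layout (one row per token, one column per hidden dimension) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def mean_curvature(states: np.ndarray) -> float:
    """Mean turning angle (radians) along a trajectory of hidden states.

    `states` is assumed to have shape (n_tokens, hidden_dim), holding one
    layer's representation for each successive token of a sentence.
    """
    states = np.asarray(states, dtype=float)
    if states.ndim != 2 or states.shape[0] < 3:
        raise ValueError("expected a 2-D array with at least three token states")

    # Displacement between consecutive token representations.
    diffs = np.diff(states, axis=0)

    # Normalize each displacement; the clip guards against zero-length steps.
    norms = np.linalg.norm(diffs, axis=1, keepdims=True)
    units = diffs / np.clip(norms, 1e-12, None)

    # Angle between each pair of successive unit displacements.
    cosines = np.sum(units[:-1] * units[1:], axis=1)
    angles = np.arccos(np.clip(cosines, -1.0, 1.0))

    return float(angles.mean())
```

Under this definition, a perfectly straight trajectory has curvature 0, and lower values correspond to the smoother representations that the abstract associates with better brain alignment.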