Different types of syntactic agreement recruit the same units within large language models

📅 2025-12-03
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study investigates whether distinct syntactic agreement relations (such as subject–verb, anaphoric, and determiner–noun agreement) share neural representations in large language models (LLMs). Using a functional localization approach, the authors identify and causally intervene on the neural units most responsive to 67 English syntactic phenomena across seven open-weight LLMs; cross-lingual experiments cover English, Russian, Chinese, and 57 additional languages. Results show substantial overlap in unit recruitment across agreement types; this shared representation pattern generalizes across languages, and the degree of unit sharing correlates positively with typological structural similarity among languages. The work provides the first direct causal evidence that syntactic knowledge in LLMs is encoded in a distributed, structure-sensitive manner, revealing syntactic agreement as a functional category within the models' representational space.

📝 Abstract
Large language models (LLMs) can reliably distinguish grammatical from ungrammatical sentences, but how grammatical knowledge is represented within the models remains an open question. We investigate whether different syntactic phenomena recruit shared or distinct components in LLMs. Using a functional localization approach inspired by cognitive neuroscience, we identify the LLM units most responsive to 67 English syntactic phenomena in seven open-weight models. These units are consistently recruited across sentences containing the phenomena and causally support the models' syntactic performance. Critically, different types of syntactic agreement (e.g., subject-verb, anaphor, determiner-noun) recruit overlapping sets of units, suggesting that agreement constitutes a meaningful functional category for LLMs. This pattern holds in English, Russian, and Chinese; further, in a cross-lingual analysis of 57 diverse languages, structurally more similar languages share more units for subject-verb agreement. Taken together, these findings reveal that syntactic agreement, a critical marker of syntactic dependencies, constitutes a meaningful category within LLMs' representational spaces.
Problem

Research questions and friction points this paper is trying to address.

Investigates how grammatical knowledge is represented in large language models
Identifies units responsive to syntactic phenomena across multiple languages
Examines if different agreement types share overlapping neural components
Innovation

Methods, ideas, or system contributions that make the work stand out.

Functional localization identifies units responsive to syntactic phenomena
Overlapping units handle different syntactic agreement types
Cross-lingual analysis shows shared units for structurally similar languages
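The localization-and-overlap idea above can be sketched in a few lines. This is a hypothetical illustration, not the authors' code: it ranks units by their mean activation difference on grammatical versus ungrammatical minimal pairs, keeps the top-k as the "localized" set for a phenomenon, and measures the Jaccard overlap between two phenomena's unit sets (here on random toy activations; real inputs would be unit activations extracted from an LLM).

```python
import numpy as np

def localize_units(acts_gram, acts_ungram, k=100):
    """Return the indices of the k units whose mean activation differs
    most between grammatical and ungrammatical minimal pairs.
    acts_*: (n_sentences, n_units) arrays of unit activations."""
    diff = np.abs(acts_gram.mean(axis=0) - acts_ungram.mean(axis=0))
    return set(np.argsort(diff)[-k:].tolist())

def overlap(units_a, units_b):
    """Jaccard overlap between two localized unit sets."""
    return len(units_a & units_b) / len(units_a | units_b)

# Toy demo with random activations (stand-ins for real LLM activations)
rng = np.random.default_rng(0)
n_units = 1000
subj_verb = localize_units(rng.normal(size=(50, n_units)),
                           rng.normal(size=(50, n_units)))
det_noun = localize_units(rng.normal(size=(50, n_units)),
                          rng.normal(size=(50, n_units)))
print(f"subject-verb / determiner-noun overlap: {overlap(subj_verb, det_noun):.3f}")
```

The paper's causal claim goes a step further than this sketch: after localizing units, the authors intervene on them (e.g., ablate them) and check that syntactic performance degrades.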
Daria Kryvosheieva
Massachusetts Institute of Technology
Andrea Gregor de Varda
Massachusetts Institute of Technology
Evelina Fedorenko
Massachusetts Institute of Technology
Greta Tuckute
Post-doc, Brain and Cognitive Sciences, MIT
Cognitive neuroscience · Artificial intelligence