🤖 AI Summary
This study investigates whether distinct syntactic agreement relations (such as subject–verb, anaphoric, and determiner–noun agreement) share neural representations in large language models (LLMs). Using a functional localization approach, we identify the units most responsive to 67 English syntactic phenomena in seven open-weight LLMs and causally intervene on them; further experiments cover Russian and Chinese, along with a cross-lingual analysis of 57 languages. Different agreement types recruit overlapping sets of units. This pattern holds in English, Russian, and Chinese, and in the 57-language analysis, structurally more similar languages share more units for subject–verb agreement. The work provides causal evidence that the identified units support the models' syntactic performance, and it suggests that syntactic agreement is a meaningful functional category within the models' representational spaces.
📝 Abstract
Large language models (LLMs) can reliably distinguish grammatical from ungrammatical sentences, but how grammatical knowledge is represented within the models remains an open question. We investigate whether different syntactic phenomena recruit shared or distinct components in LLMs. Using a functional localization approach inspired by cognitive neuroscience, we identify the LLM units most responsive to 67 English syntactic phenomena in seven open-weight models. These units are consistently recruited across sentences containing the phenomena and causally support the models' syntactic performance. Critically, different types of syntactic agreement (e.g., subject-verb, anaphor, determiner-noun) recruit overlapping sets of units, suggesting that agreement constitutes a meaningful functional category for LLMs. This pattern holds in English, Russian, and Chinese; and further, in a cross-lingual analysis of 57 diverse languages, structurally more similar languages share more units for subject-verb agreement. Taken together, these findings reveal that syntactic agreement, a critical marker of syntactic dependencies, constitutes a meaningful category within LLMs' representational spaces.
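The localization-and-overlap idea described above can be illustrated with a toy sketch. This is not the authors' pipeline: the contrast (grammatical versus ungrammatical sentences), the Welch t-statistic, the `top_k` cutoff, the Jaccard overlap measure, and all function names (`welch_t`, `localize_units`, `jaccard`, `simulate`) are assumptions chosen for illustration, and the activations are synthetic rather than taken from any model.

```python
import random
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic between two samples of activations."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def localize_units(gram, ungram, top_k):
    """Return indices of the top_k units with the largest |t| contrast.

    gram / ungram: lists of activation vectors, one vector per sentence.
    Using |t| and a fixed top_k is an assumption of this sketch.
    """
    n_units = len(gram[0])
    scores = []
    for u in range(n_units):
        t = welch_t([s[u] for s in gram], [s[u] for s in ungram])
        scores.append((abs(t), u))
    scores.sort(reverse=True)
    return {u for _, u in scores[:top_k]}


def jaccard(a, b):
    """Overlap between two unit sets (intersection over union)."""
    return len(a & b) / len(a | b)


def simulate(responsive, rng, n_units=200, n_sents=40, shift=3.0):
    """Synthetic data: 'responsive' units are shifted in grammatical sentences."""
    gram = [
        [rng.gauss(shift if u in responsive else 0.0, 1.0) for u in range(n_units)]
        for _ in range(n_sents)
    ]
    ungram = [
        [rng.gauss(0.0, 1.0) for _ in range(n_units)] for _ in range(n_sents)
    ]
    return gram, ungram


rng = random.Random(0)
shared = set(range(10))  # hypothetical units shared by two agreement types

# Two toy "agreement" phenomena share units 0-9; a third phenomenon does not.
sv_units = localize_units(*simulate(shared | set(range(10, 15)), rng), top_k=15)
dn_units = localize_units(*simulate(shared | set(range(20, 25)), rng), top_k=15)
other_units = localize_units(*simulate(set(range(100, 115)), rng), top_k=15)

print("agreement vs agreement:", jaccard(sv_units, dn_units))
print("agreement vs unrelated:", jaccard(sv_units, other_units))
```

In this toy setup the two simulated agreement phenomena share units by construction, so their localized sets overlap more than either does with the unrelated phenomenon. With real models, the activation vectors would come from the network's units, and a causal test would additionally ablate the localized units and measure the change in grammaticality judgments.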