Different types of syntactic agreement recruit the same units within large language models

📅 2025-12-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates whether distinct syntactic agreement relations (such as subject-verb, anaphor, and determiner-noun agreement) recruit shared units in large language models (LLMs). Using a functional localization approach inspired by cognitive neuroscience, the authors identify the units most responsive to 67 English syntactic phenomena in seven open-weight LLMs and show that these units causally support the models' syntactic performance. Different agreement types recruit overlapping sets of units. This pattern holds in English, Russian, and Chinese, and in a cross-lingual analysis of 57 diverse languages, structurally more similar languages share more units for subject-verb agreement. The findings suggest that syntactic agreement constitutes a meaningful functional category within LLMs' representational spaces.

📝 Abstract

Large language models (LLMs) can reliably distinguish grammatical from ungrammatical sentences, but how grammatical knowledge is represented within the models remains an open question. We investigate whether different syntactic phenomena recruit shared or distinct components in LLMs. Using a functional localization approach inspired by cognitive neuroscience, we identify the LLM units most responsive to 67 English syntactic phenomena in seven open-weight models. These units are consistently recruited across sentences containing the phenomena and causally support the models' syntactic performance. Critically, different types of syntactic agreement (e.g., subject-verb, anaphor, determiner-noun) recruit overlapping sets of units, suggesting that agreement constitutes a meaningful functional category for LLMs. This pattern holds in English, Russian, and Chinese; further, in a cross-lingual analysis of 57 diverse languages, structurally more similar languages share more units for subject-verb agreement. Taken together, these findings reveal that syntactic agreement, a critical marker of syntactic dependencies, constitutes a meaningful category within LLMs' representational spaces.
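To make the localization idea concrete, here is a minimal sketch on synthetic data. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not specify the selection statistic, the fraction of units kept, or the ablation method. The Welch t-statistic, the `top_frac` value, the zero-ablation, and the function names `localize_units` and `ablate` are all assumptions chosen for the example.

```python
import numpy as np

def localize_units(gram_acts, ungram_acts, top_frac=0.01):
    """Pick the units whose activations differ most between two conditions.

    gram_acts, ungram_acts: arrays of shape (n_sentences, n_units).
    Assumption: units are ranked by |Welch t|; the source does not
    name the statistic or the selection threshold.
    """
    mean_diff = gram_acts.mean(axis=0) - ungram_acts.mean(axis=0)
    var_g = gram_acts.var(axis=0, ddof=1) / gram_acts.shape[0]
    var_u = ungram_acts.var(axis=0, ddof=1) / ungram_acts.shape[0]
    t_stat = mean_diff / np.sqrt(var_g + var_u + 1e-12)
    k = max(1, int(round(top_frac * t_stat.size)))
    top = np.argsort(-np.abs(t_stat))[:k]
    return np.sort(top)

def ablate(acts, unit_idx):
    """Zero out the selected units (one assumed form of causal intervention)."""
    out = acts.copy()
    out[:, unit_idx] = 0.0
    return out

# Synthetic stand-in for model activations: 200 sentences, 1000 units.
rng = np.random.default_rng(0)
n_sent, n_units = 200, 1000
gram = rng.normal(size=(n_sent, n_units))
ungram = rng.normal(size=(n_sent, n_units))

# Plant a strong condition effect in the first 10 units.
planted = np.arange(10)
gram[:, planted] += 3.0

selected = localize_units(gram, ungram, top_frac=0.01)
ablated = ablate(gram, selected)
```

In a real experiment the activation matrices would come from a model's hidden states on minimal pairs, and the effect of `ablate` would be measured on the model's grammaticality judgments rather than on the activations themselves.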

Problem

Research questions and friction points this paper is trying to address.

Investigates how grammatical knowledge is represented in large language models

Identifies units responsive to syntactic phenomena across multiple languages

Examines if different agreement types share overlapping neural components

Innovation

Methods, ideas, or system contributions that make the work stand out.

Functional localization identifies units responsive to syntactic phenomena

Overlapping units handle different syntactic agreement types

Cross-lingual analysis shows shared units for structurally similar languages
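One way to quantify how much two localized unit sets overlap is sketched below. The Jaccard index and the random-selection baseline are assumptions made for illustration; the summary does not state which overlap measure the paper uses. The unit indices are invented.

```python
import numpy as np

def jaccard(a, b):
    """Jaccard index between two collections of unit indices."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def chance_overlap(n_units, k, n_perm=1000, seed=0):
    """Mean Jaccard index between two random k-unit sets (chance baseline)."""
    rng = np.random.default_rng(seed)
    vals = []
    for _ in range(n_perm):
        a = rng.choice(n_units, size=k, replace=False)
        b = rng.choice(n_units, size=k, replace=False)
        vals.append(jaccard(a, b))
    return float(np.mean(vals))

# Hypothetical unit indices for two agreement types (not real results).
subject_verb = [3, 17, 42, 88, 105]
anaphor = [3, 17, 42, 90, 200]

observed = jaccard(subject_verb, anaphor)      # 3 shared units out of 7 distinct
baseline = chance_overlap(n_units=1000, k=5)   # overlap expected by chance
```

Comparing `observed` against `baseline` indicates whether two phenomena share more units than random selection would produce. The same comparison could be applied across languages for a single phenomenon.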
