🤖 AI Summary
This study investigates whether large language models (LLMs) exhibit syntactic bootstrapping, the cognitive mechanism by which children infer verb semantics from a verb's syntactic distribution, with particular attention to psychological verbs. Method: The authors train RoBERTa and GPT-2 on datasets in which either syntactic structure or lexical co-occurrence statistics are systematically perturbed, quantifying how much verb and noun representations rely on syntactic versus distributional cues. Contribution/Results: Verb representations degrade significantly under syntactic perturbation, whereas noun representations depend more strongly on co-occurrence statistics; mental verbs, for which syntactic bootstrapping is especially important in human verb learning, are most affected. This provides empirical support for the syntactic bootstrapping hypothesis in large-scale pretrained language models. Moreover, the study demonstrates that manipulating models' training environments offers a scalable way to test developmental linguistic theories, bridging computational modeling and cognitive linguistics. The findings underscore syntax as a critical inductive bias for verb learning in LLMs, aligning with psycholinguistic accounts of early language acquisition.
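The paper's exact perturbation procedures are not spelled out in this summary, but the two ablations it contrasts can be sketched in a minimal, hypothetical form: shuffling tokens within a sentence destroys syntactic structure while leaving bag-of-words co-occurrence intact, whereas swapping tokens across sentences (here, by sampling from position-matched pools over the corpus) scrambles which words co-occur while roughly preserving sentence shape. The function names and the pooling strategy below are illustrative assumptions, not the authors' implementation.

```python
import random

def ablate_syntax(sentence: str, seed: int = 0) -> str:
    """Shuffle tokens within one sentence: which words co-occur is
    preserved, but syntactic structure (word order) is destroyed."""
    tokens = sentence.split()
    rng = random.Random(seed)
    rng.shuffle(tokens)
    return " ".join(tokens)

def ablate_cooccurrence(sentences: list[str], seed: int = 0) -> list[str]:
    """Replace each token with a random token drawn from the pool of
    tokens seen at the same sentence position anywhere in the corpus:
    sentence length and rough positional structure survive, but the
    lexical co-occurrence statistics are scrambled."""
    rng = random.Random(seed)
    # Pool tokens by sentence position across the whole corpus.
    pools: dict[int, list[str]] = {}
    for s in sentences:
        for i, tok in enumerate(s.split()):
            pools.setdefault(i, []).append(tok)
    out = []
    for s in sentences:
        n = len(s.split())
        out.append(" ".join(rng.choice(pools[i]) for i in range(n)))
    return out
```

Training one model on each perturbed corpus and comparing the resulting verb versus noun representations is the kind of controlled contrast the summary describes.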
📝 Abstract
Syntactic bootstrapping (Gleitman, 1990) is the hypothesis that children use the syntactic environments in which a verb occurs to learn its meaning. In this paper, we examine whether large language models exhibit similar behavior. We do so by training RoBERTa and GPT-2 on perturbed datasets in which syntactic information is ablated. Our results show that models' verb representations degrade more when syntactic cues are removed than when co-occurrence information is removed. Furthermore, the representation of mental verbs, for which syntactic bootstrapping has been shown to be particularly crucial in human verb learning, is more negatively impacted under such training regimes than that of physical verbs. In contrast, models' representations of nouns are affected more when co-occurrences are distorted than when syntax is distorted. In addition to reinforcing the important role of syntactic bootstrapping in verb learning, our results demonstrate the viability of testing developmental hypotheses at a larger scale by manipulating the learning environments of large language models.