A Unified Assessment of the Poverty of the Stimulus Argument for Neural Language Models

📅 2026-02-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates the Poverty of the Stimulus Hypothesis (PoSH) by examining whether neural language models can acquire child-like syntactic generalization abilities without language-specific innate constraints and with only limited linguistic input (10–50 million words). To this end, the authors introduce poshbench, the first benchmark suite targeting core PoSH phenomena such as question formation and movement islands, and systematically evaluate Transformer models within a unified framework. They further incorporate three cognitively inspired inductive biases intended to improve learning efficiency. Results show that while the models exhibit some degree of generalization even in the absence of direct positive evidence, their data efficiency and the robustness of their generalizations remain substantially inferior to those of children. Although the added biases improve overall syntactic competence, they do not significantly enhance performance on the critical generalization tasks in poshbench.

📝 Abstract
How can children acquire native-level syntax from limited input? According to the Poverty of the Stimulus Hypothesis (PoSH), the linguistic input children receive is insufficient to explain certain generalizations that are robustly learned; innate linguistic constraints, many have argued, are thus necessary to explain language learning. Neural language models, which lack such language-specific constraints in their design, offer a computational test of this longstanding (but controversial) claim. We introduce poshbench, a training-and-evaluation suite targeting question formation, islands to movement, and other English phenomena at the center of the PoSH arguments. Training Transformer models on 10–50M words of developmentally plausible text, we find indications of generalization on all phenomena even without direct positive evidence, yet neural models remain less data-efficient and their generalizations are weaker than those of children. We further enhance our models with three recently proposed cognitively motivated inductive biases. We find these biases improve general syntactic competence but not poshbench performance. Our findings challenge the claim that innate syntax is the only possible route to generalization, while suggesting that human-like data efficiency requires inductive biases beyond those tested here.
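The abstract does not spell out the evaluation protocol, but suites of this kind (e.g., BLiMP-style benchmarks) are commonly scored with minimal pairs: the model "passes" an item if it assigns higher probability to the grammatical member of a sentence pair than to its ungrammatical counterpart. Below is a minimal sketch, assuming a HuggingFace-style causal language model; the checkpoint name and the specific auxiliary-fronting pair are illustrative stand-ins, not taken from poshbench itself.

```python
# Minimal-pair scoring sketch (illustrative; not the paper's released code).
# Compares an LM's log-likelihood for a question formed by the
# structure-dependent auxiliary-fronting rule against a linear-rule variant.
import torch
from transformers import AutoTokenizer, GPT2LMHeadModel

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in checkpoint
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def sentence_logprob(sentence: str) -> float:
    """Total log-probability of a sentence under the LM."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # .loss is the mean negative log-likelihood per predicted token,
        # so multiply by the number of predicted tokens to get the total.
        loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.size(1) - 1)

# Classic poverty-of-the-stimulus contrast for question formation:
grammatical = "Has the boy who is smiling left?"    # structure-dependent rule
ungrammatical = "Is the boy who smiling has left?"  # linear "front the first aux" rule

# The model generalizes on this item if it prefers the grammatical form.
print(sentence_logprob(grammatical) > sentence_logprob(ungrammatical))
```

Aggregated over many such pairs, this preference rate is one natural way to quantify the paper's claim that models generalize "even without direct positive evidence" while still falling short of child-level robustness.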
Problem

Research questions and friction points this paper is trying to address.

Poverty of the Stimulus
language acquisition
syntactic generalization
inductive biases
neural language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Poverty of the Stimulus
neural language models
poshbench
inductive biases
syntactic generalization