BabyLM's First Constructions: Causal interventions provide a signal of learning

📅 2025-06-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the open question of whether pre-trained language models (PLMs) can learn constructions under developmentally plausible conditions—specifically, when trained on amounts of text approximating a child's linguistic exposure. Method: the authors systematically evaluate BabyLM models—lightweight PLMs trained on developmentally constrained corpora—on their ability to encode form–meaning mappings across diverse syntactic constructions, including difficult cases that are superficially similar but semantically distinct. They employ Rozner et al.'s causal intervention analysis framework together with a construction-sensitive evaluation protocol. Contribution/Results: all BabyLM models robustly encode multiple constructions, and construction encoding strength correlates significantly with downstream task performance. This is the first empirical demonstration that PLMs can acquire constructional knowledge under developmentally realistic data constraints, revealing both the structural validity and the functional relevance of their grammatical representations.
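The causal intervention framework referenced above (Rozner et al.) roughly involves comparing a model's output distribution before and after perturbing construction-relevant material in the input. The paper's exact protocol is not given here; the sketch below only illustrates the distribution-comparison step with a KL divergence over made-up next-token distributions.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token distributions for an original sentence and a
# minimally perturbed (intervened) variant -- the numbers are made up.
p_original   = [0.70, 0.20, 0.10]
p_intervened = [0.30, 0.40, 0.30]

# A large shift suggests the perturbed material causally influenced the
# model's predictions; near-zero shift suggests it did not.
shift = kl_divergence(p_original, p_intervened)
print(f"distribution shift under intervention: {shift:.3f}")
```

In practice the two distributions would come from a real PLM's softmaxed logits over its vocabulary; this toy version just makes the comparison concrete.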

📝 Abstract
Construction grammar posits that children acquire constructions (form-meaning pairings) from the statistics of their environment. Recent work supports this hypothesis by showing sensitivity to constructions in pretrained language models (PLMs), including one recent study (Rozner et al., 2025) demonstrating that constructions shape the PLM's output distribution. However, models under study have generally been trained on developmentally implausible amounts of data, casting doubt on their relevance to human language learning. Here we use Rozner et al.'s methods to evaluate constructional learning in models from the 2024 BabyLM challenge. Our results show that even when trained on developmentally plausible quantities of data, models represent diverse constructions, even hard cases that are superficially indistinguishable. We further find correlational evidence that constructional performance may be functionally relevant: models that better represent constructions perform better on the BabyLM benchmarks.
Problem

Research questions and friction points this paper is trying to address.

Evaluating constructional learning in models trained on developmentally plausible amounts of data
Assessing the relevance of small-scale models to human language acquisition
Linking construction representation to BabyLM benchmark performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Causal interventions used as a signal of constructional learning in models
Construction grammar analysis applied to 2024 BabyLM challenge models
Models that better represent constructions perform better on BabyLM benchmarks
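The link between construction representation and benchmark performance reported above is correlational. A minimal sketch of how such a rank correlation could be computed, using hypothetical per-model scores (not the paper's data):

```python
def ranks(xs):
    """Rank values 1..n (assumes no ties)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman(x, y):
    """Spearman rank correlation via the classic 1 - 6*sum(d^2)/(n(n^2-1)) formula."""
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Illustrative scores for five hypothetical BabyLM models.
construction_score = [0.55, 0.61, 0.70, 0.74, 0.82]
benchmark_score    = [0.40, 0.48, 0.47, 0.60, 0.66]

rho = spearman(construction_score, benchmark_score)
print(f"Spearman rho = {rho:.2f}")  # high rho = better construction encoding
                                    # tracks better benchmark performance
```

A rank correlation like this only establishes association, which matches the paper's hedged framing ("correlational evidence that constructional performance may be functionally relevant").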