Measuring Form and Function in Language Models

📅 2026-05-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes a framework for quantitatively evaluating whether large language models attain the syntactic and pragmatic competence observed in child language acquisition, focusing on English determiners. It brings a two-dimensional benchmark from developmental linguistics, covering both formal syntactic structure and discourse-functional usage, into the assessment of language models. The study designs the Contextual Alternative Choice (CAC) prompting paradigm to probe models' linguistic knowledge in a targeted manner. Systematic comparisons with empirical child behavioral data and statistical benchmarks show that no current model trained on a comparable amount of data satisfies both dimensions simultaneously; only some very large models do, which illustrates both the validity and the difficulty of the proposed evaluation framework.
📝 Abstract
We introduce quantitative metrics for child language acquisition to evaluate language models. Our focus is on the formal syntactic and functional discourse properties of determiners in English, which young children acquire early and accurately. We propose Contextual Alternative Choice (CAC), a new prompting method which provides targeted tests for both syntactic and discourse knowledge of language. The method enables direct comparison of language models against children and, more importantly, against statistical benchmarks independently established in empirical research. No current model trained on a comparable amount of data simultaneously meets both the formal and the functional benchmarks as human children do, but some very large models do. We present our results as methodological and technical contributions, with specific emphasis on the cognitive status of language models.
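The abstract does not spell out how CAC prompts are built or scored. As a purely illustrative sketch, a forced choice between determiner alternatives in context could look like the following. Everything here is an assumption rather than the authors' implementation: the names `choose_alternative`, `accuracy`, and `toy_score`, the `{det}` template slot, and the two example items are hypothetical, and `toy_score` is a hand-written stand-in for a real model's log-probability.

```python
from typing import Callable, Sequence, Tuple

# One test item: (context, sentence template with a {det} slot, expected determiner).
Item = Tuple[str, str, str]


def choose_alternative(
    context: str,
    template: str,
    alternatives: Sequence[str],
    score: Callable[[str], float],
) -> str:
    """Fill each alternative into the template and return the highest-scoring one."""
    return max(
        alternatives,
        key=lambda alt: score(f"{context} {template.format(det=alt)}"),
    )


def accuracy(
    items: Sequence[Item],
    alternatives: Sequence[str],
    score: Callable[[str], float],
) -> float:
    """Fraction of items where the chosen alternative matches the expected one."""
    hits = sum(
        choose_alternative(ctx, tpl, alternatives, score) == expected
        for ctx, tpl, expected in items
    )
    return hits / len(items)


def toy_score(text: str) -> float:
    """Hypothetical stand-in for a language model's log-probability of `text`.

    Assumes the final two words are "<determiner> <noun>". It prefers "the"
    when the noun was already mentioned earlier in the text, and "a" otherwise.
    """
    words = text.lower().replace(".", "").split()
    det, noun = words[-2], words[-1]
    mentioned_before = noun in words[:-2]
    if det == "the":
        return 1.0 if mentioned_before else 0.0
    return 0.0 if mentioned_before else 1.0


# Hypothetical items: the first has a previously mentioned referent, the second a new one.
ITEMS = [
    ("I saw a dog.", "I petted {det} dog.", "the"),
    ("We went outside.", "I saw {det} cat.", "a"),
]

if __name__ == "__main__":
    print(accuracy(ITEMS, ["a", "the"], toy_score))
```

In an actual evaluation, `toy_score` would be replaced by a function that queries the model under test, and the resulting choice rates would be compared against the child data and statistical benchmarks mentioned in the abstract.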
Problem

Research questions and friction points this paper is trying to address.

language models
child language acquisition
syntactic properties
discourse properties
determiners
Innovation

Methods, ideas, or system contributions that make the work stand out.

Contextual Alternative Choice
language model evaluation
child language acquisition
syntactic knowledge
discourse function