🤖 AI Summary
Language model evaluation has largely focused on grammatical acceptability and has paid far less attention to whether models can interpret the meanings that grammatical forms convey. Drawing on Construction Grammar, this work proposes CxMP, a minimal-pair benchmark built around form–meaning pairings (constructions) that covers nine construction types, including the let-alone, caused-motion, and ditransitive constructions. Its controlled minimal pairs test whether a model can interpret the semantic relations a construction implies. The results show that syntactic competence emerges early, whereas constructional understanding develops more gradually and remains limited even in large language models, pointing to persistent gaps in how models integrate form and meaning.
📝 Abstract
Recent work has examined language models from a linguistic perspective to better understand how they acquire language. Most existing benchmarks focus on judging grammatical acceptability, whereas the ability to interpret meanings conveyed by grammatical forms has received much less attention. We introduce the Linguistic Minimal-Pair Benchmark for Evaluating Constructional Understanding in Language Models (CxMP), a benchmark grounded in Construction Grammar that treats form-meaning pairings, or constructions, as fundamental linguistic units. CxMP evaluates whether models can interpret the semantic relations implied by constructions, using a controlled minimal-pair design across nine construction types, including the let-alone, caused-motion, and ditransitive constructions. Our results show that while syntactic competence emerges early, constructional understanding develops more gradually and remains limited even in large language models (LLMs). CxMP thus reveals persistent gaps in how language models integrate form and meaning, providing a framework for studying constructional understanding and learning trajectories in language models.
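As a rough illustration of how a minimal-pair benchmark is commonly scored, the sketch below computes the fraction of pairs in which a model assigns a higher score to the preferred item. The abstract does not describe CxMP's actual scoring procedure, so everything here is an assumption made for illustration: the `MinimalPair` type, the `minimal_pair_accuracy` function, the scorer interface (any function mapping a string to a number, such as a sentence log-probability), the example sentence pair, and the scores attached to it.

```python
from typing import Callable, Iterable, NamedTuple


class MinimalPair(NamedTuple):
    """Two minimally different strings; `preferred` is the one a model should favor."""
    preferred: str
    dispreferred: str


def minimal_pair_accuracy(
    pairs: Iterable[MinimalPair],
    score: Callable[[str], float],
) -> float:
    """Return the fraction of pairs where `score` ranks the preferred item strictly higher."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one pair")
    correct = sum(score(p.preferred) > score(p.dispreferred) for p in pairs)
    return correct / len(pairs)


if __name__ == "__main__":
    # Hypothetical item written for this sketch; it is NOT taken from CxMP.
    example_pairs = [
        MinimalPair(
            preferred="He can't walk, let alone run.",
            dispreferred="He can't run, let alone walk.",
        ),
    ]
    # Made-up scores standing in for a real model's sentence log-probabilities.
    made_up_scores = {
        "He can't walk, let alone run.": -10.0,
        "He can't run, let alone walk.": -12.0,
    }
    print(minimal_pair_accuracy(example_pairs, made_up_scores.__getitem__))
```

In practice, `score` would wrap a language model, for example by summing token log-probabilities over the sentence. Whether CxMP compares sentence scores in this way or uses a different task format is not stated in the abstract.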