🤖 AI Summary
This study tests the computationally verifiable core of the Distributional Learning Hypothesis: whether linguistic constructions (form-meaning pairings) are implicitly encoded in word co-occurrence distributions, given that raw text alone cannot resolve counterfactual questions about why a particular word occurred.
Method: Treating RoBERTa as a computationally tractable proxy for the distribution of natural language, we propose a construction-sensitive metric of statistical affinity and systematically evaluate its discriminative capacity on challenging construction pairs.
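To make "statistical affinity" concrete, here is a minimal, illustrative sketch using count-based pointwise mutual information (PMI) over a toy corpus. This is an assumption for exposition only: the study's metric is derived from RoBERTa's probabilities rather than raw co-occurrence counts, and the corpus, function name, and word choices below are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def pmi_affinity(sentences, w1, w2):
    """PMI between two words, estimated from sentence-level
    co-occurrence in a toy corpus. Higher PMI = stronger affinity."""
    n = len(sentences)
    unigram = Counter()   # how many sentences contain each word
    pair = Counter()      # how many sentences contain each word pair
    for sent in sentences:
        words = set(sent.split())
        unigram.update(words)
        pair.update(frozenset(p) for p in combinations(sorted(words), 2))
    p1 = unigram[w1] / n
    p2 = unigram[w2] / n
    p12 = pair[frozenset((w1, w2))] / n
    if p12 == 0:
        return float("-inf")  # never co-occur: no positive affinity
    return math.log2(p12 / (p1 * p2))

# Toy pinyin corpus (hypothetical), including ba-construction sentences.
corpus = [
    "ta ba shu fang zai zhuo shang",
    "ta ba men guan le",
    "ta kan shu",
    "men guan le",
]
print(pmi_affinity(corpus, "ba", "ta"))
```

A PLM-based variant would replace the count ratios with probabilities read off a masked language model, which is what lets the method probe counterfactuals that static corpus counts cannot.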
Contribution/Results: We provide the first empirical evidence that the internal statistical affinities of pretrained language models robustly distinguish semantically divergent yet surface-similar constructions (e.g., *bǎ*-constructions vs. dispositional constructions) as well as abstract slot-and-frame patterns (e.g., "X yī Y jiù Z"). These affinities appear to be an important but insufficient cue for construction acquisition. Our hybrid qualitative-quantitative construction identification framework performs well on multiple ambiguous construction types while explicitly delineating the inherent limits of purely distributional approaches.
📝 Abstract
Construction grammar posits that constructions (form-meaning pairings) are acquired through experience with language (the distributional learning hypothesis). But how much information about constructions does this distribution actually contain? Corpus-based analyses provide some answers, but text alone cannot answer counterfactual questions about what caused a particular word to occur. For that, we need computable models of the distribution over strings -- namely, pretrained language models (PLMs). Here we treat a RoBERTa model as a proxy for this distribution and hypothesize that constructions will be revealed within it as patterns of statistical affinity. We support this hypothesis experimentally: many constructions are robustly distinguished, including (i) hard cases where semantically distinct constructions are superficially similar, as well as (ii) schematic constructions, whose "slots" can be filled by abstract word classes. Despite this success, we also provide qualitative evidence that statistical affinity alone may be insufficient to identify all constructions from text. Thus, statistical affinity is likely an important, but partial, signal available to learners.