Anything Goes? A Crosslinguistic Study of (Im)possible Language Learning in LMs

📅 2025-02-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates whether large language models (LLMs) exhibit human-like linguistic inductive biases: specifically, whether they distinguish attested natural languages both from "impossible" counterparts and from typologically unattested word orders. Method: a systematic perplexity evaluation of GPT-2 small across 12 natural languages spanning four language families, with controlled syntactic manipulations guided by Greenberg's Universal 20 generating counterfactual word orders for contrastive experiments. Contribution/Results: the model largely discriminates attested languages from their impossible counterparts but does not achieve full separation, and its perplexity does not systematically track typological attestation when the unattested variants preserve constituency structure. This points to a weak, partial human-like inductive bias, and the study provides the first cross-family empirical evidence on the linguistic cognitive foundations of LLMs.
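The evaluation metric throughout is perplexity: the exponential of the average negative log-likelihood per token under the model. A minimal sketch of the computation (the function name and toy numbers are illustrative, not from the paper):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood per token.
    Lower perplexity means the model found the sequence more predictable."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Toy check: a uniform model over a 4-word vocabulary assigns each
# token probability 0.25, so its perplexity equals the vocabulary size.
uniform_logprobs = [math.log(0.25)] * 10
ppl = perplexity(uniform_logprobs)  # ≈ 4.0
```

In the paper's setup, the per-token log-probabilities would come from GPT-2 small scoring text in each attested language and its manipulated counterpart; comparing the two perplexities is the contrastive test.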

📝 Abstract
Do LLMs offer insights into human language learning? A common argument against this idea is that because their architecture and training paradigm are so vastly different from those of humans, LLMs can learn arbitrary inputs as easily as natural languages. In this paper, we test this claim by training LMs to model impossible and typologically unattested languages. Unlike previous work, which has focused exclusively on English, we conduct experiments on 12 natural languages from 4 language families. Our results show that while GPT-2 small can largely distinguish attested languages from their impossible counterparts, it does not achieve perfect separation between all the attested languages and all the impossible ones. We further test whether GPT-2 small distinguishes typologically attested from unattested languages with different NP orders by manipulating word order based on Greenberg's Universal 20. We find that the model's perplexity scores do not distinguish attested vs. unattested word orders, as long as the unattested variants maintain constituency structure. These findings suggest that language models exhibit some human-like inductive biases, though these biases are weaker than those found in human learners.
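Greenberg's Universal 20 constrains the relative order of demonstrative (Dem), numeral (Num), adjective (Adj), and noun (N) inside a noun phrase; only some of the 24 logical permutations are attested cross-linguistically. A hypothetical sketch of the kind of counterfactual manipulation the abstract describes (the function and example sentence are mine, not the authors' pipeline):

```python
def reorder_np(np_tokens, roles, target_order):
    """Permute a noun phrase's words into a target Universal-20 role order.

    np_tokens: surface words of the NP, in their original order.
    roles: the Universal-20 category ("Dem", "Num", "Adj", "N") of each word.
    target_order: the desired permutation of the four categories.
    """
    by_role = dict(zip(roles, np_tokens))
    return [by_role[r] for r in target_order]

# English order Dem-Num-Adj-N, remapped to the mirror order N-Adj-Num-Dem
# (attested in some languages); other permutations would be unattested.
np_words = ["these", "three", "big", "dogs"]
np_roles = ("Dem", "Num", "Adj", "N")
mirrored = reorder_np(np_words, np_roles, ("N", "Adj", "Num", "Dem"))
# → ["dogs", "big", "three", "these"]
```

Because the permutation applies within the NP, the manipulated text preserves constituency structure, which is exactly the condition under which the paper finds perplexity fails to separate attested from unattested orders.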
Problem

Research questions and friction points this paper is trying to address.

Can LMs learn impossible languages as easily as attested natural languages?
Do LMs share the inductive biases of human language learners?
Do findings from English-only studies hold across typologically diverse languages?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Trains GPT-2 small on impossible-language counterparts of natural languages
Experiments on 12 natural languages from 4 language families, beyond English-only prior work
Analyzes how Universal-20 word-order manipulations affect perplexity