What Kind of Language is Easy to Language-Model Under Curriculum Learning?

📅 2026-04-29

🤖 AI Summary
This study investigates how learning conditions influence language models' ability to reproduce typological tendencies observed in human languages, focusing on the interaction between curriculum learning and the models' inductive biases. Motivated by developmental psychology, it applies a simple-to-complex curriculum learning setup to the computational modeling of linguistic typology, as a first study of this kind, and evaluates model performance across combinations of word order and other structural features. The findings indicate that the order of training data substantially reshapes a model's apparent inductive bias, strengthening its preference for syntactic structures commonly found in natural languages. These results suggest that the learning trajectory, and not only the model architecture, constrains whether human-like typological patterns emerge.
📝 Abstract
Many of the thousands of attested languages share common configurations of features, creating a spectrum from typologically very rare (e.g., object-verb-subject word order) or impossible languages to very common combinations of features (e.g., subject-object-verb word order). One central question is under what conditions such typological tendencies can be predicted, and specifically whether the learning bias of language models (LMs) is sufficient to reproduce such patterns. In this study, we add one dimension to such analysis, namely the learning scenario for LMs, to explore its interaction with the inductive bias of LMs. Specifically, as a first study, we examine the effect of curriculum learning (CL) as a developmentally motivated learning scenario, i.e., starting with simpler sentences rather than randomly ordered input. We expand existing LM-based exploration (El-Naggar et al., 2025a,b) with a simple CL variant and find that CL substantially impacts the apparent inductive bias of LMs.
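The contrast the abstract draws, simpler sentences first versus randomly ordered input, can be illustrated with a minimal sketch. The abstract does not say how the paper measures sentence simplicity, so the token-count measure below is an assumption, and the function names `token_count`, `curriculum_order`, and `random_order` and the toy corpus are hypothetical and not taken from the paper.

```python
import random


def token_count(sentence):
    """Assumed difficulty measure: number of whitespace-separated tokens."""
    return len(sentence.split())


def curriculum_order(sentences, difficulty=token_count):
    """Return sentences sorted from simplest to most complex.

    sorted() is stable, so sentences of equal difficulty keep their
    original relative order.
    """
    return sorted(sentences, key=difficulty)


def random_order(sentences, seed=0):
    """Baseline: the same sentences in a seeded random order."""
    shuffled = list(sentences)
    random.Random(seed).shuffle(shuffled)
    return shuffled


# Hypothetical toy corpus, for illustration only.
corpus = [
    "the dog that chased the cat sleeps",
    "dogs sleep",
    "the dog chased the cat",
    "the dog sleeps",
]

if __name__ == "__main__":
    for sentence in curriculum_order(corpus):
        print(token_count(sentence), sentence)
```

Both orderings present exactly the same sentences to the model, so any difference in what it learns is attributable to presentation order alone, which is the variable the study manipulates.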
Problem

Research questions and friction points this paper is trying to address.

language modeling
typological tendencies
curriculum learning
inductive bias
language universals
Innovation

Methods, ideas, or system contributions that make the work stand out.

curriculum learning
language models
inductive bias
linguistic typology
learning scenario