Language Models as Artificial Learners: Investigating Crosslinguistic Influence

📅 2026-01-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the inconsistent empirical findings on cross-linguistic influence (CLI) in human bilingualism research, which are often attributed to inadequate experimental control. It employs language models as controlled statistical learners to systematically manipulate first-language (L1) dominance, age of second-language (L2) acquisition, and L1–L2 syntactic distance. By integrating cross-linguistic priming paradigms with neural representational analyses, the work disentangles the mechanisms driving CLI. Results demonstrate that language dominance and proficiency are key predictors of CLI: priming of grammatical structures is bidirectional, whereas priming of ungrammatical structures is modulated by language dominance. Crucially, the study provides the first neural representational evidence that the L1 is co-activated during L2 processing, offering converging computational and neurocognitive support for theoretical accounts of cross-linguistic influence.
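The manipulated "age of L2 acquisition" can be pictured as a training-data schedule: the model sees only L1 data until some step, then a mixture whose L1 share encodes dominance. A minimal sketch of that idea (the function name, parameters, and sampling scheme are illustrative assumptions, not the paper's implementation):

```python
import random

def batch_language(step: int, l2_intro_step: int, l1_share: float,
                   rng: random.Random) -> str:
    """Choose the language of one training batch.

    Before `l2_intro_step`, only L1 batches are drawn (monolingual
    pretraining). Afterwards, batches come from L1 with probability
    `l1_share`, simulating continued L1 dominance during L2 exposure.
    """
    if step < l2_intro_step:
        return "L1"
    return "L1" if rng.random() < l1_share else "L2"

# Example: a "late" age of exposure (L2 introduced at step 5)
# with a dominant L1 (70% of post-exposure batches).
rng = random.Random(0)
schedule = [batch_language(s, l2_intro_step=5, l1_share=0.7, rng=rng)
            for s in range(10)]
```

Varying `l2_intro_step` alone then isolates age of exposure, while varying `l1_share` isolates dominance, which is the kind of control the paper argues human studies lack.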

📝 Abstract
Despite the centrality of crosslinguistic influence (CLI) to bilingualism research, human studies often yield conflicting results due to inherent experimental variance. We address these inconsistencies by using language models (LMs) as controlled statistical learners to systematically simulate CLI and isolate its underlying drivers. Specifically, we study the effect of varying the L1 language dominance and the L2 language proficiency, which we manipulate by controlling the L2 age of exposure -- defined as the training step at which the L2 is introduced. Furthermore, we investigate the impact of pretraining on L1 languages with varying syntactic distance from the L2. Using cross-linguistic priming, we analyze how activating L1 structures impacts L2 processing. Our results align with evidence from psycholinguistic studies, confirming that language dominance and proficiency are strong predictors of CLI. We further find that while priming of grammatical structures is bidirectional, the priming of ungrammatical structures is sensitive to language dominance. Finally, we provide mechanistic evidence of CLI in LMs, demonstrating that the L1 is co-activated during L2 processing and directly influences the neural circuitry recruited for the L2. More broadly, our work demonstrates that LMs can serve as a computational framework to inform theories of human CLI.
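The cross-linguistic priming analysis described above can be illustrated as a surprisal contrast: how much less surprising an L2 target structure becomes when a structurally matched L1 prime precedes it. A hedged sketch of that measure (the `logprob_fn` interface and function names are assumptions for illustration; the paper's actual scoring setup may differ):

```python
from typing import Callable, Sequence

# A model is abstracted as a function returning the total log-probability
# of `target` tokens given `context` tokens.
LogProbFn = Callable[[Sequence[str], Sequence[str]], float]

def surprisal(logprob_fn: LogProbFn,
              context: Sequence[str], target: Sequence[str]) -> float:
    """Negative log-probability of `target` given `context`."""
    return -logprob_fn(context, target)

def priming_effect(logprob_fn: LogProbFn,
                   prime: Sequence[str], target: Sequence[str]) -> float:
    """Reduction in target surprisal when the L1 prime precedes it.

    Positive values indicate facilitation: activating the L1 structure
    makes the matched L2 target easier to process.
    """
    primed = surprisal(logprob_fn, prime, target)
    unprimed = surprisal(logprob_fn, (), target)
    return unprimed - primed
```

Comparing this effect for L1-prime/L2-target versus L2-prime/L1-target pairs is one way to operationalize the bidirectionality finding reported in the abstract.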
Problem

Research questions and friction points this paper addresses.

crosslinguistic influence
bilingualism
language dominance
L2 proficiency
experimental variance
Innovation

Methods, ideas, or system contributions that make the work stand out.

language models
crosslinguistic influence
language dominance
cross-linguistic priming
computational modeling
Abderrahmane Issam
Department of Advanced Computing Sciences, Maastricht University

Yusuf Can Semerci
Assistant Professor, Maastricht University
Human Activity Recognition · Affective Computing · Educational Computing

Jan Scholtes
Department of Advanced Computing Sciences, Maastricht University

Gerasimos Spanakis
Assistant Professor, Maastricht University