Developmentally-plausible Working Memory Shapes a Critical Period for Language Acquisition

📅 2025-02-07
🤖 AI Summary
This paper addresses the gap between large language models (LLMs) and humans in language acquisition efficiency. To narrow that gap, it proposes building the developmental trajectory of working memory during the human critical period for language into LLM training: memory capacity is tightly constrained at the start of training, and the constraint is then relaxed on an exponential schedule over training steps. This constitutes the first developmentally interpretable, dynamic working memory constraint mechanism introduced in language modeling. The method comprises three core components: dynamic working memory gating, exponential constraint scheduling, and a syntax-focused evaluation protocol. In targeted syntactic evaluation, the model outperforms both unconstrained and statically constrained baselines. These results support developmentally inspired memory regulation as a route to more data-efficient language modeling. The approach also offers indirect computational evidence for the critical period hypothesis.

📝 Abstract
Large language models exhibit general linguistic abilities but significantly differ from humans in their efficiency of language acquisition. This study proposes a method for integrating the developmental characteristics of working memory during the critical period, a stage when human language acquisition is particularly efficient, into language models. The proposed method introduces a mechanism that initially constrains working memory during the early stages of training and gradually relaxes this constraint in an exponential manner as learning progresses. Targeted syntactic evaluation shows that the proposed method outperforms conventional models without memory constraints or with static memory constraints. These findings not only provide new directions for designing data-efficient language models but also offer indirect evidence supporting the underlying mechanisms of the critical period hypothesis in human language acquisition.
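
The mechanism described above, a memory constraint that starts strong and relaxes exponentially during training, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the paper's implementation: the function names, the `initial` and `final` strengths, and the choice to model limited working memory as a linear distance penalty on attention logits are all hypothetical.

```python
import math


def constraint_strength(step, total_steps, initial=1.0, final=0.01):
    """Exponentially relax a memory constraint from `initial` to `final`.

    Hypothetical schedule: the strength shrinks by a constant factor per
    step, so it equals `initial` at step 0 and `final` at `total_steps`.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    progress = min(max(step / total_steps, 0.0), 1.0)
    return initial * (final / initial) ** progress


def attention_weights(logits, strength):
    """Softmax over past tokens with a distance penalty on each logit.

    The query is assumed to sit at the last position. Each key is
    penalised in proportion to its distance from the query, so a large
    `strength` concentrates attention on recent tokens (an assumed
    stand-in for limited working memory).
    """
    query_pos = len(logits) - 1
    biased = [
        logit - strength * (query_pos - key_pos)
        for key_pos, logit in enumerate(logits)
    ]
    peak = max(biased)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in biased]
    total = sum(exps)
    return [value / total for value in exps]


if __name__ == "__main__":
    # Compare how attention spreads over five past tokens at three stages.
    for step in (0, 500, 1000):
        strength = constraint_strength(step, total_steps=1000)
        weights = attention_weights([0.0] * 5, strength)
        print(step, round(strength, 4), [round(w, 3) for w in weights])
```

With uniform logits, early steps put most of the attention mass on the most recent token, while late steps spread it almost evenly over the context. A static constraint would correspond to holding `strength` fixed, and an unconstrained model to setting it to zero.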
Problem

Research questions and friction points this paper is trying to address.

Language models acquire language far less efficiently than humans
Training does not reflect the developmental working memory characteristics of the human critical period
It is unclear whether a gradually relaxed memory constraint beats static or absent constraints
Innovation

Methods, ideas, or system contributions that make the work stand out.

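Integrates the developmental characteristics of working memory into language model training
Constrains working memory early in training, then relaxes the constraint exponentially
Outperforms models with static or no memory constraints in targeted syntactic evaluation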