🤖 AI Summary
Efficient word acquisition from scratch remains a fundamental challenge for language models. Method: We propose the trial-and-demonstration (TnD) interactive learning framework, which draws on the way caregivers shape human language development through demonstration and feedback. TnD pairs a student model with a teacher and combines three components: student trials, teacher demonstrations, and a reward conditioned on language competence at various developmental stages. Contribution/Results: In systematically controlled experiments, TnD accelerates word acquisition for student models with equal or smaller numbers of parameters, and both trials and demonstrations prove important to that gain. The teacher's choice of words influences the student's word-specific learning efficiency, and the frequency of words in trials correlates strongly with their learning curves, a practice-makes-perfect effect. These findings suggest that interactive learning with teacher demonstrations and active trials can facilitate efficient word learning in language models.
📝 Abstract
Humans are efficient language learners and inherently social creatures. Our language development is largely shaped by our social interactions, for example, demonstrations and feedback from caregivers. In contrast to human language learning, recent advancements in large language models have primarily adopted a non-interactive training paradigm and have refined pre-trained models through feedback only afterward. In this work, we explore how corrective feedback from interactions influences neural language acquisition from scratch through systematically controlled experiments, assessing whether it contributes to word learning efficiency in language models. We introduce a trial-and-demonstration (TnD) learning framework that incorporates three distinct components: student trials, teacher demonstrations, and a reward conditioned on language competence at various developmental stages. Our experiments reveal that the TnD approach accelerates word acquisition for student models of equal and smaller numbers of parameters, and we highlight the significance of both trials and demonstrations. We further show that the teacher's choices of words influence students' word-specific learning efficiency, and that a practice-makes-perfect effect is evidenced by a strong correlation between the frequency of words in trials and their respective learning curves. Our findings suggest that interactive language learning, with teacher demonstrations and active trials, can facilitate efficient word learning in language models.