AI Summary
This work addresses the gap between cognitive modeling and language modeling by organizing the fourth BabyLM Challenge and Workshop, which focuses on data-efficient pretraining for language models that are both cognitively plausible and computationally efficient. The challenge introduces its first multilingual track, extending BabyLM to cross-linguistic settings, and the workshop incorporates related research directions such as cognitively inspired architectures, weak-model evaluation, and training-efficiency optimization. By encouraging low-resource, interpretable models grounded in human-like learning mechanisms, the initiative strengthens the interdisciplinary connection between cognitive science and artificial intelligence and helps build a research community around cognitively grounded language models.
Abstract
BabyLM aims to dissolve the boundaries between cognitive modeling and language modeling. We call both for workshop papers and for researchers to join the 4th BabyLM competition. As in previous years, we invite participants to take on the data-efficient pretraining challenge in the general track. This year, we also offer a new track: Multilingual.
We also welcome papers outside the competition on any relevant topic, including training efficiency, cognitively plausible research, weak-model evaluation, and more.