🤖 AI Summary
The ability of current language models to learn a new word from a few examples, and then use it systematically in new contexts, remains underexplored. This paper introduces Minnow (Meta-training for IN-context learNing Of Words), a meta-training method in which a special placeholder token stands in for the new word and the model learns to generate further usages of that word from a few in-context examples. Repeating this over many words develops a general word-learning ability. Minnow is applied in two settings: (i) training models from scratch on human-scale child-directed language, and (ii) finetuning pre-trained large language models (LLMs). Models trained from scratch with Minnow reach few-shot word-learning performance comparable to an LLM pre-trained on orders of magnitude more data. Finetuning pre-trained LLMs with Minnow improves four abilities given one or a few in-context examples: discriminating between new words, identifying their syntactic categories, generating new usages, and generating definitions.
📝 Abstract
Humans can quickly learn a new word from a few illustrative examples, and then systematically and flexibly use it in novel contexts. Yet the abilities of current language models for few-shot word learning, and methods for improving these abilities, are underexplored. In this study, we introduce a novel method, Meta-training for IN-context learNing Of Words (Minnow). This method trains language models to generate new examples of a word's usage given a few in-context examples, using a special placeholder token to represent the new word. This training is repeated on many new words to develop a general word-learning ability. We find that training models from scratch with Minnow on human-scale child-directed language enables strong few-shot word learning, comparable to a large language model (LLM) pre-trained on orders of magnitude more data. Furthermore, through discriminative and generative evaluations, we demonstrate that finetuning pre-trained LLMs with Minnow improves their ability to discriminate between new words, identify syntactic categories of new words, and generate reasonable new usages and definitions for new words, based on one or a few in-context examples. These findings highlight the data efficiency of Minnow and its potential to improve language model performance in word learning tasks.
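To make the training setup in the abstract concrete, the following is a minimal sketch of how one meta-training episode might be assembled. The placeholder spelling `[new-token]`, the `<sep>` separator, and the helper names `mask_word` and `build_episode` are assumptions made for illustration. They are not taken from the paper.

```python
import random
import re

PLACEHOLDER = "[new-token]"  # assumed spelling of the placeholder token
SEPARATOR = " <sep> "        # assumed separator between usage examples


def mask_word(sentence: str, word: str) -> str:
    """Replace whole-word, case-insensitive occurrences of `word` with the placeholder."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", flags=re.IGNORECASE)
    return pattern.sub(PLACEHOLDER, sentence)


def build_episode(word, sentences, n_context, rng):
    """Return (context, target) for one word.

    `context` holds n_context masked usage examples joined by the separator;
    `target` is one held-out masked usage the model would be trained to generate.
    """
    if len(sentences) < n_context + 1:
        raise ValueError("need at least n_context + 1 sentences")
    sampled = rng.sample(sentences, n_context + 1)
    masked = [mask_word(s, word) for s in sampled]
    context = SEPARATOR.join(masked[:n_context]) + SEPARATOR
    target = masked[n_context]
    return context, target


# Hypothetical child-directed usages of the word "ball".
usages = [
    "the ball rolled away",
    "can you throw the ball",
    "that is a big red ball",
]
context, target = build_episode("ball", usages, n_context=2, rng=random.Random(0))
print(context)
print(target)
```

Under these assumptions, a language model would be trained to continue `context` with `target`, and episodes would be built for many different words so that the placeholder carries no fixed meaning of its own.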