🤖 AI Summary
This study addresses the problem of automatically building general-purpose capabilities in language models through curriculum learning. It proposes a framework called cognitive training, in which language models act as players and meta-samplers and a family of tasks called cross-entropy games supplies the task curriculum; the curriculum is grown by iterating a greedy optimization algorithm so that the model discovers relevant skills step by step. The core contribution is a formalization of the meta-objective that drives this growth: if a curriculum for relevant skill discovery can be grown by greedy iteration, then, under natural assumptions, the meta-objective is essentially unique up to a few hyperparameters. The authors postulate that, given sufficiently capable models and sufficient training time, cognitive training offers a principled route to relevant skill discovery, and therefore to general capabilities insofar as these are achievable through greedy curriculum learning.
📝 Abstract
Defining a constructive process that builds general capabilities for language models automatically is considered an open problem in artificial intelligence. Toward this goal, we consider the problem of building a curriculum of tasks that grows a model via relevant skill discovery. We provide a concrete framework for this problem, using a family of tasks called cross-entropy games, which we postulate is universal in a suitable sense. We show that if it is possible to grow the curriculum for relevant skill discovery by iterating a greedy optimization algorithm, then, under natural assumptions, there is essentially only one possible meta-objective (up to a few hyperparameters). We call the resulting process cognitive training. We postulate that, given sufficiently capable language models as players and meta-samplers and sufficient training time, cognitive training provides a principled route to relevant skill discovery; hence, to the extent that general capabilities are achievable via greedy curriculum learning, cognitive training would be a solution.
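
To make the phrase "iterating a greedy optimization algorithm" concrete, here is a minimal Python sketch of greedy curriculum growth. It is an illustration under loud assumptions, not the paper's method: the names `grow_curriculum`, `toy_propose` and `toy_objective` are hypothetical, the tasks are plain integers rather than cross-entropy games, and the scoring function is a toy stand-in for the meta-objective, whose actual form the abstract does not state.

```python
from typing import Callable, Iterable, List, TypeVar

Task = TypeVar("Task")


def grow_curriculum(
    propose: Callable[[List[Task]], Iterable[Task]],
    meta_objective: Callable[[List[Task], Task], float],
    steps: int,
) -> List[Task]:
    """Greedily append, at each step, the candidate with the highest score."""
    curriculum: List[Task] = []
    for _ in range(steps):
        # The (hypothetical) meta-sampler proposes candidates given the curriculum so far.
        candidates = list(propose(curriculum))
        if not candidates:
            break  # nothing left to add
        # Greedy choice: keep only the single best-scoring candidate.
        best = max(candidates, key=lambda task: meta_objective(curriculum, task))
        curriculum.append(best)
    return curriculum


# Toy stand-ins (assumptions for illustration only).
def toy_propose(curriculum: List[int]) -> List[int]:
    """Propose every task from a fixed pool that is not yet in the curriculum."""
    pool = [1, 2, 3, 5, 8]
    return [task for task in pool if task not in curriculum]


def toy_objective(curriculum: List[int], task: int) -> float:
    """Prefer tasks whose difficulty is just beyond the hardest task so far."""
    frontier = max(curriculum, default=0)
    return -abs(task - (frontier + 1))


result = grow_curriculum(toy_propose, toy_objective, steps=4)
print(result)
```

With these toy stand-ins the loop selects tasks 1, 2, 3 and then 5, each time choosing the candidate closest to one step past the current frontier. In the setting described by the abstract, the proposer would be a language model acting as meta-sampler and the score would be the essentially unique meta-objective; both are replaced here by placeholders.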