🤖 AI Summary
To address catastrophic forgetting (CF) in continual learning and the limitations of hand-crafted algorithm design, this paper proposes Automated Continual Learning (ACL), a meta-learning framework that trains self-referential neural networks to synthesize their own in-context continual learning strategies. ACL encodes the desiderata of continual learning -- good performance on both old and new tasks -- directly into its meta-learning objectives, mitigating forgetting without experience replay. Its core contribution is *algorithm self-generation*: through meta-training, the network autonomously evolves task-adaptive continual learning mechanisms, effectively learning *how to learn continually*. On replay-free Split-MNIST, ACL-learned algorithms outperform hand-designed state-of-the-art methods. Moreover, they demonstrate strong cross-task generalization and unified applicability across diverse benchmarks, including few-shot and standard image classification datasets (e.g., CIFAR-100, Mini-ImageNet). This work advances continual learning by shifting from manual strategy engineering to end-to-end, self-referential algorithm synthesis.
📝 Abstract
General-purpose learning systems should improve themselves in an open-ended fashion in ever-changing environments. Conventional learning algorithms for neural networks, however, suffer from catastrophic forgetting (CF) -- previously acquired skills are forgotten when a new task is learned. Instead of hand-crafting new algorithms for avoiding CF, we propose Automated Continual Learning (ACL) to train self-referential neural networks to meta-learn their own in-context continual (meta-)learning algorithms. ACL encodes all desiderata -- good performance on both old and new tasks -- into its meta-learning objectives. Our experiments demonstrate that ACL effectively solves "in-context catastrophic forgetting"; our ACL-learned algorithms outperform hand-crafted ones, e.g., on the Split-MNIST benchmark in the replay-free setting, and enable continual learning of diverse tasks consisting of multiple few-shot and standard image classification datasets.
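The key idea of encoding both desiderata (old-task and new-task performance) into one meta-learning objective can be sketched in a toy form. This is a minimal illustration, not the paper's implementation: the in-context learner here is a stand-in closed-form ridge fit rather than a self-referential network, and all function names (`in_context_predict`, `acl_style_meta_loss`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def in_context_predict(context_x, context_y, query_x):
    # Stand-in for an in-context learner: a ridge-regression fit on the
    # context stream (the paper uses a self-referential sequence model).
    lam = 1e-3
    w = np.linalg.solve(
        context_x.T @ context_x + lam * np.eye(context_x.shape[1]),
        context_x.T @ context_y,
    )
    return query_x @ w

def acl_style_meta_loss(task_a, task_b):
    """Meta-objective in the spirit of ACL: after seeing task A followed by
    task B in context, the learner is evaluated on BOTH tasks' queries,
    so in-context forgetting of task A is penalized directly."""
    (ax, ay, aqx, aqy), (bx, by, bqx, bqy) = task_a, task_b
    ctx_x = np.vstack([ax, bx])            # task A then task B, one stream
    ctx_y = np.concatenate([ay, by])
    loss_new = np.mean((in_context_predict(ctx_x, ctx_y, bqx) - bqy) ** 2)
    loss_old = np.mean((in_context_predict(ctx_x, ctx_y, aqx) - aqy) ** 2)
    return loss_old + loss_new             # both desiderata in one scalar

def make_task(w):
    # A toy linear-regression task with 32 context and 8 query points.
    x = rng.normal(size=(32, 4))
    xq = rng.normal(size=(8, 4))
    return x, x @ w, xq, xq @ w

loss = acl_style_meta_loss(make_task(rng.normal(size=4)),
                           make_task(rng.normal(size=4)))
print(float(loss))
```

In the actual method, this scalar would be minimized by gradient descent over the sequence model's (slow) weights, so that the learned in-context update rule itself avoids forgetting.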