Simple Lifelong Learning Machines

📅 2020-04-27

📈 Citations: 12

✨ Influential: 0

🤖 AI Summary

The core challenge in lifelong learning is to mitigate catastrophic forgetting of previously learned tasks while improving both backward transfer (better performance on past tasks) and forward transfer (better performance on future, not-yet-encountered tasks). This paper proposes a representation ensembling method that requires no memory replay, explicit regularization, or architectural expansion, yet demonstrates transfer in both directions. By sharing representations across tasks and reweighting features across tasks, the approach operates with or without a computational budget. Evaluation on benchmarks spanning vision and speech, including CIFAR-100, Split Mini-ImageNet, Food1k, the 5-dataset suite, and spoken-digit recognition, shows forward and backward transfer where state-of-the-art continual learning methods typically fail to achieve one or both.

📝 Abstract

In lifelong learning, data are used to improve performance not only on the present task, but also on past and future (unencountered) tasks. While typical transfer learning algorithms can improve performance on future tasks, their performance on prior tasks degrades upon learning new tasks (called forgetting). Many recent approaches for continual or lifelong learning have attempted to maintain performance on old tasks given new tasks. But striving to avoid forgetting sets the goal unnecessarily low. The goal of lifelong learning should be to use data to improve performance on both future tasks (forward transfer) and past tasks (backward transfer). In this paper, we show that a simple approach -- representation ensembling -- demonstrates both forward and backward transfer in a variety of simulated and benchmark data scenarios, including tabular, vision (CIFAR-100, 5-dataset, Split Mini-Imagenet, and Food1k), and speech (spoken digit), in contrast to various reference algorithms, which typically failed to transfer either forward or backward, or both. Moreover, our proposed approach can flexibly operate with or without a computational budget.
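The distinction between forward and backward transfer can be made concrete with a small numerical sketch. The code below is an illustration only: the function name `transfer_scores`, the accuracy-difference definitions, and all numbers are assumptions made for this example, not the paper's own metric or results. It assumes an accuracy matrix `acc[i, j]` holding the accuracy on task `j` after training sequentially on tasks `0..i`, plus a baseline `single_task_acc[j]` for a learner trained on task `j` alone.

```python
import numpy as np

def transfer_scores(acc, single_task_acc):
    """Hypothetical accuracy-difference transfer scores.

    acc[i, j]          : accuracy on task j after training on tasks 0..i in order
    single_task_acc[j] : accuracy on task j when trained on task j in isolation
    """
    acc = np.asarray(acc, dtype=float)
    single = np.asarray(single_task_acc, dtype=float)
    just_learned = np.diag(acc)  # accuracy on each task right after it is learned

    # Backward transfer: how much later tasks changed performance on earlier ones.
    # Negative values correspond to forgetting.
    backward = acc[-1, :-1] - just_learned[:-1]

    # Forward transfer: how much earlier tasks helped a task when it is first learned,
    # relative to learning that task alone.
    forward = just_learned[1:] - single[1:]
    return forward, backward

# Made-up numbers for three tasks; entries above the diagonal are unused.
acc = [
    [0.70, 0.00, 0.00],
    [0.72, 0.65, 0.00],
    [0.75, 0.68, 0.80],
]
single_task_acc = [0.70, 0.60, 0.74]

forward, backward = transfer_scores(acc, single_task_acc)
print("forward transfer per task: ", np.round(forward, 2))
print("backward transfer per task:", np.round(backward, 2))
```

Under these definitions, a learner that merely avoids forgetting has backward scores near zero, whereas the goal stated in the abstract corresponds to positive scores in both arrays.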

Problem

Research questions and friction points this paper is trying to address.

Improves performance on past and future tasks simultaneously

Addresses forgetting in lifelong learning with representation ensembling

Works across diverse data types (tabular, vision, speech), with or without a computational budget

Innovation

Methods, ideas, or system contributions that make the work stand out.

Representation ensembling for lifelong learning

Enables forward and backward transfer

Works flexibly with or without a computational budget
