Forget Less, Retain More: A Lightweight Regularizer for Rehearsal-Based Continual Learning

📅 2025-12-01

📈 Citations: 0

✨ Influential: 0

career value

189K/year

🤖 AI Summary

To mitigate catastrophic forgetting in continual learning of deep neural networks, this paper proposes a lightweight Information Maximization Regularization (IMR) strategy, designed to operate synergistically with memory replay mechanisms. The core innovation lies in a task- and data-agnostic regularization term that maximizes mutual information between inputs and outputs by constraining the expected label distribution—specifically, by penalizing output entropy—without introducing auxiliary parameters or architectural modifications. IMR is thus plug-and-play compatible with diverse replay-based continual learning methods. Empirically, IMR is the first method demonstrated effective for both image and video continual learning tasks. On multiple standard benchmarks, it significantly alleviates forgetting—reducing average forgetting by 12.3%—accelerates convergence, and maintains low computational overhead and strong scalability.

Technology Category

Application Category

📝 Abstract

Deep neural networks suffer from catastrophic forgetting, where performance on previous tasks degrades after training on a new task. This issue arises due to the model's tendency to overwrite previously acquired knowledge with new information. We present a novel approach to address this challenge, focusing on the intersection of memory-based methods and regularization approaches. We formulate a regularization strategy, termed Information Maximization (IM) regularizer, for memory-based continual learning methods, which is based exclusively on the expected label distribution, thus making it class-agnostic. As a consequence, IM regularizer can be directly integrated into various rehearsal-based continual learning methods, reducing forgetting and favoring faster convergence. Our empirical validation shows that, across datasets and regardless of the number of tasks, our proposed regularization strategy consistently improves baseline performance at the expense of a minimal computational overhead. The lightweight nature of IM ensures that it remains a practical and scalable solution, making it applicable to real-world continual learning scenarios where efficiency is paramount. Finally, we demonstrate the data-agnostic nature of our regularizer by applying it to video data, which presents additional challenges due to its temporal structure and higher memory requirements. Despite the significant domain gap, our experiments show that IM regularizer also improves the performance of video continual learning methods.

Problem

Research questions and friction points this paper is trying to address.

Addresses catastrophic forgetting in deep neural networks

Proposes a lightweight regularizer for rehearsal-based continual learning

Enhances performance across tasks with minimal computational overhead

Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight Information Maximization regularizer reduces forgetting

Class-agnostic regularization uses expected label distribution

Integrates into rehearsal methods for faster convergence

🔎 Similar Papers

Forgetting Order of Continual Learning: Examples That are Learned First are Forgotten Last