PMF-CL: Pareto-Minimal-Forgetting Continual Learner for Conflicting Tasks

📅 2026-05-18
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses catastrophic forgetting in continual learning caused by task conflicts, particularly in realistic scenarios where the tasks share no common global minimizer. It introduces Pareto optimality into continual learning and proposes a “Pareto-minimal forgetting” principle: choose solutions that forget previous tasks minimally in the Pareto sense. The framework covers linear regression, basis-function regression, and general losses with a quadratic upper bound, such as logistic regression. For quadratic problems it uses iterative updates with a static memory footprint of O(d²) for a model with d parameters. According to the summary, experiments show that the method significantly mitigates forgetting on sequences of conflicting tasks and retains performance close to the theoretical optimum at low memory cost.
📝 Abstract
In the literature, many continual learning (CL) algorithms have been proposed to address the issue of catastrophic forgetting in ML models (i.e., learning new tasks leads to the loss of performance on previously learned tasks). Although all CL approaches use some form of memory to retain information about past tasks, a grounded understanding of what information needs to be stored to minimize catastrophic forgetting remains elusive. Recently, it has been recognized that under the strong assumption of the existence of a common global minimizer over all tasks, catastrophic forgetting can be completely avoided. However, in practice, tasks rarely have a common global minimizer, and a certain amount of forgetting is inevitable. In this paper, we propose a foundational framework for principled and systematic CL of conflicting tasks using a multi-task learning (MTL) perspective. The approach is based on finding Pareto-optimal solutions, i.e., the solutions which, by definition, minimally forget the previous tasks in the Pareto sense. We derive Pareto-minimal-forgetting CL algorithms for linear and basis-function regression, and general loss functions which have a quadratic upper bound, e.g., logistic regression. For quadratic problems, PMF-CL uses memory-efficient iterative updates with a static memory footprint of $\mathcal{O}(d^2)$ for models with $d$ parameters.
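To make the memory claim concrete, here is a minimal sketch of why quadratic losses allow a fixed O(d²) footprint. It is not the paper's PMF-CL algorithm. It uses plain weighted-sum scalarization, a standard technique that yields a Pareto-optimal point for convex objectives when all weights are strictly positive. The class name `QuadraticTaskMemory`, the `weight` parameter, and the small `ridge` term are assumptions introduced only for illustration.

```python
import numpy as np


class QuadraticTaskMemory:
    """Hypothetical helper: stores only a d x d matrix and a d-vector,
    so memory stays O(d^2) no matter how many tasks are seen."""

    def __init__(self, d):
        self.A = np.zeros((d, d))  # weighted sum of X^T X over tasks
        self.b = np.zeros(d)       # weighted sum of X^T y over tasks

    def add_task(self, X, y, weight=1.0):
        # Fold a least-squares task into the running statistics;
        # the raw data (X, y) can be discarded afterwards.
        self.A += weight * (X.T @ X)
        self.b += weight * (X.T @ y)

    def solve(self, ridge=1e-8):
        # Minimizer of the weighted sum of squared-error losses.
        # The ridge term is only for numerical stability (assumption).
        d = self.A.shape[0]
        return np.linalg.solve(self.A + ridge * np.eye(d), self.b)


def task_loss(w, X, y):
    """Squared-error loss 0.5 * ||Xw - y||^2 for one task."""
    r = X @ w - y
    return 0.5 * float(r @ r)


# Two conflicting tasks: same inputs, different ground-truth weights,
# so no single w minimizes both losses.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w1 = np.array([1.0, 0.0, -1.0])
w2 = np.array([-1.0, 2.0, 0.5])
y1, y2 = X @ w1, X @ w2

mem = QuadraticTaskMemory(d=3)
mem.add_task(X, y1)
mem.add_task(X, y2)
w_mix = mem.solve()

# Compare forgetting on task 1: compromise solution vs. the task-2 optimum.
loss_task1_compromise = task_loss(w_mix, X, y1)
loss_task1_after_naive_finetune = task_loss(w2, X, y1)
```

Varying the per-task `weight` values traces out different trade-offs between the tasks. How PMF-CL selects among such solutions, and how it handles non-quadratic losses through quadratic upper bounds, is specified in the paper and not reproduced here.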
Problem

Research questions and friction points this paper is trying to address.

catastrophic forgetting
continual learning
conflicting tasks
Pareto optimality
multi-task learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Pareto-optimal
continual learning
catastrophic forgetting
multi-task learning
memory-efficient