🤖 AI Summary
In continual learning, deep neural networks suffer from catastrophic forgetting: rapid degradation of performance on previously learned tasks. Existing parameter-protection methods face two critical bottlenecks: storage overhead that scales linearly with the number of tasks, and difficulty in precisely identifying task-critical parameters. This paper introduces a paradigm shift: “paths matter more than parameters.” Instead of protecting individual parameters, we model and preserve the sparse activation pathways that encode knowledge from prior tasks. We formulate model fusion as a graph-matching problem, enabling path-level knowledge retention without parameter isolation. Our approach integrates sparse activation modeling with adaptive channel allocation, achieving superior retention of previously learned tasks on mainstream benchmarks with significantly fewer stored parameters. It effectively mitigates catastrophic forgetting while ensuring both knowledge stability and parameter efficiency.
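To make the graph-matching idea concrete, here is a minimal sketch (not the paper's actual implementation) of fusing two layers by first aligning their output channels via bipartite matching. The cost matrix, the Hungarian solver, and the `fuse_layers` helper are all illustrative assumptions; the paper's formulation may differ.

```python
# Illustrative sketch: channel alignment as bipartite graph matching
# before model fusion. Names and the averaging rule are assumptions,
# not the paper's exact method.
import numpy as np
from scipy.optimize import linear_sum_assignment

def fuse_layers(w_old: np.ndarray, w_new: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Fuse two weight matrices of shape (out_channels, in_features).

    Channels of `w_new` are first permuted to best match `w_old`
    (a linear assignment, i.e. bipartite graph matching), then the
    aligned weights are averaged.
    """
    # cost[i, j] = squared L2 distance between old channel i and new channel j
    cost = ((w_old[:, None, :] - w_new[None, :, :]) ** 2).sum(axis=-1)
    _, col_idx = linear_sum_assignment(cost)  # optimal channel matching
    w_new_aligned = w_new[col_idx]            # permute new channels into place
    return alpha * w_old + (1 - alpha) * w_new_aligned

# Toy usage: two layers with 4 output channels and 8 input features.
rng = np.random.default_rng(0)
w_a, w_b = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
print(fuse_layers(w_a, w_b).shape)  # (4, 8)
```

Matching before averaging matters because two independently trained networks may encode the same feature in differently indexed channels; naive averaging would then blend unrelated features.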
📝 Abstract
Deep networks are prone to catastrophic forgetting during sequential task learning, i.e., losing knowledge of old tasks upon learning new ones. Continual learning (CL) has emerged to address this, and its existing methods focus mostly on regulating or protecting the parameters associated with previous tasks. However, parameter protection is often impractical: the storage required for old-task parameters grows linearly with the number of tasks, and it is otherwise difficult to identify which parameters actually encode old-task knowledge. In this work, we bring a dual insight from neuroscience and physics to CL: within a network, the pathways matter more than the individual parameters where the knowledge acquired from old tasks is concerned. Following this insight, we propose a novel CL framework, learning without isolation (LwI), in which model fusion is formulated as graph matching and the pathways occupied by old tasks are protected without being isolated. Thanks to the sparsity of activation channels in a deep network, LwI can adaptively allocate available pathways to a new task, realizing pathway protection and addressing catastrophic forgetting in a parameter-efficient manner. Experiments on popular benchmark datasets demonstrate the superiority of the proposed LwI.
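The abstract's claim that activation sparsity leaves pathways free for new tasks can be illustrated with a small sketch. The thresholding rule and the `allocate_channels` helper below are assumptions for illustration only, not LwI's actual allocation procedure.

```python
# Illustrative sketch: using activation sparsity to find channels
# (pathways) that old tasks barely use and that can be allocated to a
# new task. The threshold and helper name are assumptions.
import numpy as np

def allocate_channels(activations: np.ndarray, threshold: float = 0.05) -> np.ndarray:
    """Return a boolean mask of channels available for a new task.

    `activations` has shape (num_samples, num_channels): per-channel
    responses recorded while running old-task data through a layer.
    """
    channel_activity = np.abs(activations).mean(axis=0)  # mean |activation| per channel
    return channel_activity < threshold                  # weakly used -> free to reuse

# Toy usage: 6 channels, half of which are nearly silent on old-task inputs.
rng = np.random.default_rng(0)
acts = rng.normal(size=(100, 6)) * np.array([1.0, 0.01, 1.0, 0.02, 0.0, 1.0])
print(allocate_channels(acts))  # -> [False  True False  True  True False]
```

Under this reading, channels that stay quiet on old-task data can host a new task's pathways, while the active pathways of old tasks are preserved without freezing any individual parameter outright.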