Sparse Orthogonal Parameters Tuning for Continual Learning

📅 2024-11-05
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
To mitigate catastrophic forgetting in pretrained models during continual learning, this paper proposes Sparse Orthogonal Tuning (SoTU), a parameter-efficient adaptation method. SoTU freezes the backbone network and introduces lightweight, sparse, and orthogonally constrained delta parameters—updated incrementally per task. These parameters are fused across tasks via orthogonal projection and sparsity-aware optimization. Crucially, SoTU is the first approach to jointly integrate sparsity and orthogonality into the parameter update mechanism of continual learning, replacing conventional adapter- or prompt-based fine-tuning. Evaluated on multiple standard continual learning benchmarks, SoTU achieves state-of-the-art feature representation performance in a plug-and-play, retraining-free manner—requiring no task-specific classifier design. It significantly outperforms leading adapter and prompting methods while offering superior generalization and deployment efficiency.
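The mechanism summarized above can be illustrated with a minimal sketch: extract a task's delta against the frozen backbone, sparsify it by magnitude, and fuse deltas from several tasks by summation (sparse deltas from different tasks tend to occupy near-disjoint coordinates, approximating orthogonality). The function names, the top-k magnitude heuristic, and the summation-based fusion are illustrative assumptions here, not the paper's exact algorithm.

```python
import numpy as np

def sparse_delta(pretrained, finetuned, keep_ratio=0.1):
    """Sparse delta between task-adapted weights and the frozen backbone.

    Illustrative heuristic: keep only the top `keep_ratio` fraction of
    entries by magnitude and zero out the rest.
    """
    delta = finetuned - pretrained
    k = max(1, int(keep_ratio * delta.size))
    # k-th largest absolute value across the flattened tensor
    threshold = np.sort(np.abs(delta), axis=None)[-k]
    mask = np.abs(delta) >= threshold
    return delta * mask

def fuse_deltas(pretrained, deltas):
    """Merge sparse per-task deltas onto the frozen backbone.

    Because the sparse deltas mostly touch disjoint coordinates,
    plain summation approximates an orthogonal fusion of tasks.
    """
    return pretrained + np.sum(deltas, axis=0)

# Toy usage: two "tasks" fine-tuned from the same backbone.
rng = np.random.default_rng(0)
W0 = rng.normal(size=(4, 4))                       # frozen backbone weights
task_a = W0 + rng.normal(scale=0.1, size=W0.shape)  # task-A adapted weights
task_b = W0 + rng.normal(scale=0.1, size=W0.shape)  # task-B adapted weights

d_a = sparse_delta(W0, task_a, keep_ratio=0.2)
d_b = sparse_delta(W0, task_b, keep_ratio=0.2)
W_merged = fuse_deltas(W0, [d_a, d_b])
```

The sketch keeps the backbone untouched, so adding a new task only requires computing and storing one more sparse delta, which matches the plug-and-play, retraining-free deployment the summary describes.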

📝 Abstract
Continual learning methods based on pre-trained models (PTMs), which adapt to successive downstream tasks without catastrophic forgetting, have recently gained attention. These methods typically refrain from updating the pre-trained parameters and instead employ additional adapters, prompts, and classifiers. In this paper, we investigate, from a novel perspective, the benefit of sparse orthogonal parameters for continual learning. We find that merging the sparse, orthogonal parameters of models learned from multiple streaming tasks has great potential for addressing catastrophic forgetting. Leveraging this insight, we propose a novel yet effective method called SoTU (Sparse Orthogonal Parameters TUning). We hypothesize that the effectiveness of SoTU lies in transforming the knowledge learned from multiple domains into a fusion of orthogonal delta parameters. Experimental evaluations on diverse CL benchmarks demonstrate the effectiveness of the proposed approach. Notably, SoTU achieves optimal feature representation for streaming data without necessitating complex classifier designs, making it a Plug-and-Play solution.
Problem

Research questions and friction points this paper is trying to address.

Addressing catastrophic forgetting in continual learning
Exploring sparse orthogonal parameters for task adaptation
Enabling plug-and-play feature representation without complex classifiers
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sparse orthogonal parameters for continual learning
Orthogonal delta parameters fusion
Plug-and-Play solution without complex classifiers
Kun-Peng Ning
Peking University
Machine Learning · LLMs
Hai-Jian Ke
School of Electronic and Computer Engineering, Peking University, Shenzhen, China
Yu-Yang Liu
School of Electronic and Computer Engineering, Peking University, Shenzhen, China
Jia-Yu Yao
School of Electronic and Computer Engineering, Peking University, Shenzhen, China
Yonghong Tian
School of Electronic and Computer Engineering, Peking University, Shenzhen, China
Li Yuan
Research Associate, University of Science & Technology of China (USTC)
Antibiotic resistance · Wastewater treatment · Environmental bioremediation · Anaerobic digestion · Fate of organic pollutants