🤖 AI Summary
In continual learning, models face challenges including parameter explosion, catastrophic forgetting, and poor generalization across tasks or domains. This paper proposes a lightweight low-rank meta-prompting framework that avoids storing historical samples or stacking adapters. Instead, it enables task-aware capacity allocation within a shared low-rank subspace by combining dynamic rank selection with dynamic meta-prompting. The method integrates low-rank decomposition, meta-learning, and dynamic sparse activation, supporting both class-incremental and domain-incremental settings while natively accommodating Transformer architectures. Experiments on ImageNet-R, CIFAR-100, CUB-200, and DomainNet demonstrate that the approach maintains a constant parameter count, substantially mitigates forgetting, and consistently outperforms LoRA and prompting baselines, with particularly robust generalization over long task sequences.
📝 Abstract
How to continually adapt a pre-trained model to sequential tasks with different prediction class labels and domains, and ultimately learn a model that generalizes across diverse tasks, is a long-standing challenge. Continual learning (CL) has emerged as a promising approach to leverage pre-trained models (e.g., Transformers) for sequential tasks. However, many existing CL methods incrementally store additional learned structures, such as Low-Rank Adaptation (LoRA) adapters or prompts, and sometimes even preserve features from previous samples to maintain performance. This leads to unsustainable parameter growth and escalating storage costs as the number of tasks increases. Moreover, current approaches often lack task similarity awareness, which further hinders the model's ability to adapt effectively to new tasks without interfering with previously acquired knowledge. To address these challenges, we propose FM-LoRA, a novel and efficient low-rank adaptation method that integrates both a dynamic rank selector (DRS) and dynamic meta-prompting (DMP). This framework allocates model capacity more effectively across tasks by leveraging a shared low-rank subspace critical for preserving knowledge, thereby avoiding continual parameter expansion. Extensive experiments with a Transformer backbone on various CL benchmarks, including ImageNet-R, CIFAR100, and CUB200 for class-incremental learning (CIL) and DomainNet for domain-incremental learning (DIL), demonstrate that FM-LoRA effectively mitigates catastrophic forgetting while delivering robust performance across a diverse range of tasks and domains.
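The abstract does not say how DRS or DMP are implemented, so the following is only a rough illustration of the general idea of a shared low-rank subspace with a per-task rank budget, not FM-LoRA itself. It assumes a standard LoRA-style update W + B·A, where a binary mask switches individual rank components on or off. The helper names (`masked_lora_delta`, `adapted_weight`), the dimensions, and the per-task rank values are all hypothetical, and meta-prompting is not modeled at all.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def masked_lora_delta(B, A, active_ranks):
    """Return B @ diag(mask) @ A, keeping only the first `active_ranks` components.

    Assumption: rank selection is modeled as a simple prefix mask; the
    paper's dynamic rank selector may work quite differently.
    """
    max_rank = len(A)
    mask = [1.0 if i < active_ranks else 0.0 for i in range(max_rank)]
    # Scaling column j of B by mask[j] is equivalent to B @ diag(mask).
    B_masked = [[b_ij * mask[j] for j, b_ij in enumerate(row)] for row in B]
    return matmul(B_masked, A)


def adapted_weight(W, B, A, active_ranks):
    """Frozen weight W plus the masked low-rank update."""
    delta = masked_lora_delta(B, A, active_ranks)
    return [[w + d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W, delta)]


rng = random.Random(0)
d_out, d_in, max_rank = 4, 6, 3

# Frozen pre-trained weight and ONE shared pair of low-rank factors.
W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
B = [[rng.gauss(0, 0.1) for _ in range(max_rank)] for _ in range(d_out)]
A = [[rng.gauss(0, 0.1) for _ in range(d_in)] for _ in range(max_rank)]

# Hypothetical rank budgets: each task activates a different slice of the
# same factors instead of adding a new adapter.
task_ranks = {"task_0": 1, "task_1": 3}

# The adapter size depends only on max_rank, not on the number of tasks.
n_adapter_params = d_out * max_rank + max_rank * d_in

for name, rank in task_ranks.items():
    W_task = adapted_weight(W, B, A, rank)
    print(name, "active ranks:", rank, "adapter params:", n_adapter_params)
```

In this toy setup, adding a task changes only which rank components are active, so the number of stored adapter parameters stays fixed. That is the property the abstract contrasts with methods that keep one adapter or prompt per task.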