🤖 AI Summary
This work addresses catastrophic forgetting in class-incremental learning, which it attributes to drift in the relationships among layer-wise representations. It analyzes forgetting, for the first time, from the perspective of inter-layer relational dynamics and proposes a self-rectifying low-rank adaptation method. The approach combines Low-Rank Adaptation (LoRA) with inter-layer relation matrices and constrains relation drift during new-task learning by aligning the singular values of those matrices, which is more robust than conventional entry-wise alignment. Evaluated on standard class-incremental benchmarks, the method mitigates forgetting, with gains becoming more pronounced as the task sequence lengthens. It does so while remaining parameter-efficient and stabilizing the decision boundaries of previously learned tasks.
📝 Abstract
Pre-trained models with parameter-efficient fine-tuning (PEFT) have demonstrated promising potential for class-incremental learning (CIL), yet catastrophic forgetting persists when adapting models to new tasks. In this paper, we present a novel perspective on catastrophic forgetting through the analysis of inter-layer relation drift, i.e., the progressive disruption of relationships among layer-wise representations during the learning of new tasks. We theoretically show that an increase in such drift reduces the classification margins of previously learned tasks, thereby degrading overall model performance. To address this issue, we propose Self-Rectifying inter-layer Relation Low-Rank Adaptation (SR$^2$-LoRA), a simple yet effective method that mitigates catastrophic forgetting by constraining inter-layer relation drift. Specifically, SR$^2$-LoRA constructs the relation matrices induced by the previous and current models on current-task samples and aligns their corresponding singular values. We further theoretically show that this alignment is more robust to estimation perturbations than direct entry-wise alignment. Extensive experiments on standard CIL benchmarks demonstrate that SR$^2$-LoRA effectively mitigates catastrophic forgetting, with its advantages becoming more pronounced as the number of tasks increases. Code is available at https://github.com/FqWan24/SR-2-LoRA.
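
As a rough illustration of the singular-value alignment described above, the sketch below writes out one plausible form of the objective. The abstract does not define the relation matrix, so everything here is an assumption made for illustration rather than the paper's formulation: the cosine-similarity construction, the symbols $h_\ell$, $R_\theta$, $\sigma_i$, $D_t$ and $\lambda$, the squared-error penalty, and the premise that all $L$ layers share one feature dimension (as in a ViT backbone).

```latex
% Assumed notation (not from the paper):
%   h_\ell(x;\theta)  feature of sample x at layer \ell, \ell = 1,...,L
%   \theta_{t-1}      previous model (frozen), \theta_t current model
%   D_t               samples of the current task

% Hypothetical relation matrix: pairwise cosine similarity between layers.
\[
  [R_\theta(x)]_{\ell m}
  = \frac{\langle h_\ell(x;\theta),\, h_m(x;\theta)\rangle}
         {\lVert h_\ell(x;\theta)\rVert\,\lVert h_m(x;\theta)\rVert},
  \qquad 1 \le \ell, m \le L .
\]

% Hypothetical alignment penalty: match the singular values
% \sigma_1 \ge \dots \ge \sigma_L of the two relation matrices.
\[
  \mathcal{L}_{\mathrm{align}}
  = \frac{1}{|D_t|} \sum_{x \in D_t} \sum_{i=1}^{L}
    \bigl( \sigma_i(R_{\theta_t}(x)) - \sigma_i(R_{\theta_{t-1}}(x)) \bigr)^2,
  \qquad
  \mathcal{L} = \mathcal{L}_{\mathrm{CE}} + \lambda\, \mathcal{L}_{\mathrm{align}} .
\]

% Standard fact (Weyl's inequality for singular values): a perturbation E
% of the estimated relation matrix moves each singular value by at most
% the spectral norm of E.
\[
  \lvert \sigma_i(R + E) - \sigma_i(R) \rvert \le \lVert E \rVert_2
  \qquad \text{for all } i .
\]
```

Weyl's inequality is a standard matrix-perturbation result and offers one intuition for why singular values can be a stable alignment target when the relation matrix is estimated from a limited number of current-task samples. The paper's own robustness argument may take a different form.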