🤖 AI Summary
This work addresses catastrophic forgetting in continual learning with Low-Rank Adaptation (LoRA). The composite update formed by LoRA's two low-rank factors breaks orthogonality to the subspaces of previously learned tasks, while strictly enforcing that orthogonality limits plasticity and upsets the balance between stability and adaptability. To resolve this, the authors propose Janus-LoRA, which combines Gradient Rectification, a closed-form correction that decouples the LoRA factor updates at the parameter level to preserve orthogonality, with a Decoupled Margin Loss that encourages separation between old and new task representations at the feature level. This approach is the first to jointly optimize parameter orthogonality and feature plasticity, achieving state-of-the-art performance across multiple continual learning benchmarks.
📝 Abstract
Low-Rank Adaptation (LoRA) has emerged as a promising paradigm for Continual Learning. It independently updates its low-rank factors ($A$ and $B$), creating a composite update to the full weight matrix through their interaction. To prevent catastrophic forgetting, this update should remain orthogonal to the task-specific subspace that contains previously learned knowledge. However, we identify that this composite update systematically violates this orthogonality, reintroducing interference and undermining stability. Furthermore, naively enforcing this orthogonality compromises plasticity, disrupting the delicate stability-plasticity trade-off. To resolve these issues, we propose Janus-LoRA, a framework that restores this balance through two novel components. Specifically, we first introduce Gradient Rectification, a closed-form solution that mathematically decouples LoRA's factor updates, enforcing orthogonality against the historical knowledge subspace identified by an efficient Online Estimation. Next, to enhance plasticity, we introduce a Decoupled Margin Loss that promotes feature-level separation by pushing new feature representations away from old ones, thus creating distinct, low-interference regions for new learning. Comprehensive experiments on challenging benchmarks demonstrate that by harmonizing parameter-level orthogonality with feature-level separation, Janus-LoRA achieves a superior balance and establishes new state-of-the-art performance.
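The abstract does not state its formulas, so the following is only a minimal NumPy sketch of why a composite LoRA update can leak into an old subspace. It rests on assumptions that are not taken from the paper: the adapted weight is $W = BA$, the historical subspace is spanned by the orthonormal columns of a hypothetical matrix `U` in input space, and "orthogonal" means $\Delta W\,U = 0$. Under those assumptions, projecting only the update of $A$ removes the $B\,\Delta A$ term but leaves the $\Delta B\,A$ term untouched. This is not Janus-LoRA's Gradient Rectification, only an illustration of the problem it targets.

```python
import numpy as np

def composite_update(A, B, dA, dB):
    """Change in W = B @ A after both factors move: (B + dB)(A + dA) - B A."""
    return dB @ A + B @ dA + dB @ dA

rng = np.random.default_rng(0)
d_out, d_in, r, k = 6, 8, 2, 3  # illustrative sizes, chosen arbitrarily

# Hypothetical historical subspace: k orthonormal directions in input space.
U, _ = np.linalg.qr(rng.standard_normal((d_in, k)))
P = U @ U.T  # orthogonal projector onto span(U)

A = rng.standard_normal((r, d_in))
B = rng.standard_normal((d_out, r))
dA = rng.standard_normal((r, d_in))
dB = rng.standard_normal((d_out, r))

# Naive per-factor fix: remove the old-subspace component from dA only.
dA_proj = dA @ (np.eye(d_in) - P)

dW = composite_update(A, B, dA_proj, dB)

leak = np.linalg.norm(dW @ U)                # total interference with span(U)
term_B_dA = np.linalg.norm(B @ dA_proj @ U)  # removed by the projection
term_dB_A = np.linalg.norm(dB @ A @ U)       # survives the projection

print(f"B dA term: {term_B_dA:.2e}, dB A term: {term_dB_A:.2e}, leak: {leak:.2e}")
```

In this toy setting the whole leak comes from the $\Delta B\,A$ term, which is one way to read the abstract's claim that independently updated factors violate orthogonality through their interaction.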