🤖 AI Summary
This work addresses catastrophic forgetting in large language models during continual learning. The authors propose CRAFT, a framework that unifies task routing, regularization, and merging within a single KL divergence-based objective. Rather than updating model weights, CRAFT learns low-rank interventions on hidden representations. It routes each task to a group of similar tasks based on output-distribution divergence, trains against the group's prior state with a KL divergence term that controls forgetting, and merges the resulting interventions into the shared representation using the same KL signal. According to the authors, CRAFT improves overall performance and reduces forgetting relative to strong LoRA-based baselines across multiple benchmarks and model scales, while remaining robust to task ordering.
📝 Abstract
Large language models (LLMs) can acquire new capabilities through fine-tuning, but continual adaptation often leads to catastrophic forgetting. We propose CRAFT, a continual learning framework that avoids updating model weights by instead learning low-rank interventions on hidden representations. CRAFT proceeds in three stages: it first routes each task to a group of similar tasks based on output-distribution divergence; it then fine-tunes the model using a Kullback-Leibler (KL) divergence against the group's prior state, which directly controls forgetting and determines convergence; finally, it merges interventions for the updated task into the shared representation using the same KL signal. This design unifies routing, regularization, and merging through a single KL-based objective. CRAFT improves overall performance and reduces forgetting compared to strong LoRA-based approaches across multiple benchmarks and model scales, while remaining robust to task ordering. These results suggest that controlling adaptation in representation space, guided by output-space divergence, provides a scalable and principled approach to continual learning in LLMs.
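The routing and regularization stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the routing `threshold`, the weight `beta`, the direction of the KL term, and the additive form of the intervention are all assumptions made for illustration, since the abstract does not specify them.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same outcomes."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def route_task(task_dist, group_dists, threshold):
    """Return the index of the closest group by KL, or None to open a new group.

    `threshold` is a hypothetical hyperparameter, not taken from the paper.
    """
    if not group_dists:
        return None
    divs = [kl_divergence(task_dist, g) for g in group_dists]
    best = min(range(len(divs)), key=divs.__getitem__)
    return best if divs[best] <= threshold else None

def low_rank_intervention(h, down, up):
    """Apply an additive rank-r edit to a hidden vector h (assumed form).

    `down` and `up` are r x d matrices (lists of rows). The base model's
    weights are never touched; only h is modified.
    """
    z = [sum(w * x for w, x in zip(row, h)) for row in down]  # project to r dims
    delta = [sum(up[k][j] * z[k] for k in range(len(z))) for j in range(len(h))]
    return [x + d for x, d in zip(h, delta)]

def regularized_loss(task_loss, current_dist, prior_dist, beta):
    """Task loss plus a KL penalty against the group's prior output distribution.

    The KL direction and `beta` are assumptions for this sketch.
    """
    return task_loss + beta * kl_divergence(prior_dist, current_dist)

# Usage: route a new task against two existing groups (toy 3-way distributions).
groups = [softmax([2.0, 0.5, 0.1]), softmax([0.1, 0.2, 3.0])]
new_task = softmax([1.8, 0.6, 0.2])
chosen_group = route_task(new_task, groups, threshold=0.05)
```

In this toy setup the same `kl_divergence` function serves both routing and regularization, which mirrors the abstract's point that a single KL signal drives every stage. A real system would compute the distributions from model outputs on task data rather than from hand-written logits.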