🤖 AI Summary
This paper identifies “pseudo-forgetting” in continual learning of large language models (LLMs): performance degradation stems not from actual knowledge loss but from the failure of instructions to activate pre-existing capabilities, particularly during chain-of-thought (CoT) generation.
Method: We introduce Rationale-Guidance Difficulty (RGD), a metric that quantifies how effectively a given instruction guides the model to generate appropriate rationales, and use it to empirically validate that the observed forgetting is reversible. We show that a task-agnostic instruction prefix can reactivate dormant task-specific capabilities without parameter updates, and we optimize replay data allocation by prioritizing high-RGD samples, integrating rationale-aware instruction analysis with a replay-based framework.
Results: The method substantially mitigates catastrophic forgetting across multiple LLMs while preserving both stability and plasticity; the average reasoning recovery rate improves by 32.7%, indicating robust and generalizable mitigation of pseudo-forgetting.
📝 Abstract
Although substantial efforts have been made to mitigate catastrophic forgetting in continual learning, the intrinsic mechanisms are not well understood. In this paper, we discover that when a forgetting model passively receives an externally provided partial appropriate rationale, its performance on the forgotten task can be restored. Furthermore, by simply adding a task-agnostic prefix to the original instruction, the forgetting model can actively generate an appropriate rationale to reach the correct answer. These findings suggest that the model does not actually "forget" the task knowledge; instead, the degraded performance can be attributed to the failure of the original instructions to guide the model in generating the appropriate rationales. Based on this insight, we propose the Rationale-Guidance Difficulty metric to evaluate how effectively a given instruction guides the model in generating appropriate rationales. We apply this metric to optimize the allocation of replay data in a replay-based continual learning algorithm. Experimental results demonstrate that our data allocation method effectively mitigates catastrophic forgetting while simultaneously maintaining better model plasticity across models.
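The replay-allocation idea described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes RGD is computed as the average negative log-likelihood of the appropriate rationale's tokens conditioned on the original instruction (the paper's exact formulation may differ), and the function names and toy log-probabilities are hypothetical.

```python
import math  # available if log-probs need conversion from raw probabilities

def rationale_guidance_difficulty(rationale_token_logprobs):
    """Hypothetical RGD: average negative log-likelihood of the appropriate
    rationale's tokens given the instruction. A higher value means the
    instruction guides the model toward the rationale less effectively."""
    return -sum(rationale_token_logprobs) / len(rationale_token_logprobs)

def allocate_replay(samples, budget):
    """Spend the replay budget on the samples whose instructions provide
    the weakest rationale guidance (highest RGD)."""
    ranked = sorted(samples, key=lambda s: s["rgd"], reverse=True)
    return ranked[:budget]

# Toy per-token log-probs standing in for scores from the forgetting model.
samples = [
    {"id": "a", "rgd": rationale_guidance_difficulty([-0.1, -0.2])},  # easy
    {"id": "b", "rgd": rationale_guidance_difficulty([-2.0, -1.5])},  # hard
    {"id": "c", "rgd": rationale_guidance_difficulty([-0.5, -0.6])},
]
print([s["id"] for s in allocate_replay(samples, budget=2)])  # → ['b', 'c']
```

In practice the log-probabilities would come from scoring the rationale with the continually trained LLM itself; the ranking-and-truncation step is the only allocation logic shown here.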