🤖 AI Summary
To address catastrophic forgetting, a pervasive issue in parameter-efficient fine-tuning (PEFT) of large language models (LLMs), this paper proposes a low-damage knowledge implantation method. Our approach is the first to bring mechanistic-interpretability insights about LLM internal knowledge storage directly into PEFT design. Grounded in an analysis of how Transformers represent knowledge, it combines low-rank adaptation with a knowledge-anchoring injection strategy that explicitly preserves general-purpose capabilities while adapting to downstream tasks. Experiments across multiple LLMs and realistic scenarios show that our method matches the task performance of full fine-tuning and LoRA while significantly outperforming existing PEFT methods at retaining general capabilities, yielding a superior trade-off between task-specific adaptation and generalization preservation.
📄 Abstract
Fine-tuning adapts pretrained models to specific tasks but risks catastrophic forgetting (CF), in which critical knowledge acquired during pre-training is overwritten. Current Parameter-Efficient Fine-Tuning (PEFT) methods for Large Language Models (LLMs), while efficient, often sacrifice general capabilities. To address CF within a general-purpose PEFT framework, we propose **Lo**w-damage **K**nowledge **I**mplanting (**LoKI**), a PEFT technique grounded in a mechanistic understanding of how knowledge is stored in Transformer architectures. In two real-world scenarios, LoKI achieves task-specific performance comparable to, or even surpassing, full fine-tuning and LoRA-based methods across various model types, while preserving general capabilities significantly better. Our work connects mechanistic insights into LLM knowledge storage with practical fine-tuning objectives, achieving state-of-the-art trade-offs between task specialization and the preservation of general capabilities. Our implementation is publicly available as ready-to-use code at https://github.com/Nexround/LoKI.
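The abstract describes combining low-rank adaptation with a knowledge-anchoring strategy that shields weights believed to store general knowledge. As a rough illustration only (not LoKI's actual algorithm), the sketch below applies a standard LoRA-style low-rank update, W' = W + (α/r)·BA, and gates it with a hypothetical per-weight mask so that "anchored" entries are left untouched; the mask construction and all variable names here are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 8, 8, 2, 4.0

W = rng.normal(size=(d_out, d_in))       # frozen pretrained weight matrix
A = rng.normal(size=(r, d_in)) * 0.01    # trainable low-rank factor (LoRA-style)
B = rng.normal(size=(d_out, r)) * 0.1    # trainable low-rank factor

# Hypothetical anchor mask: 1 = safe to modify, 0 = protected weight.
# In practice such a mask would come from interpretability analysis of
# where general knowledge is stored; here it is random for illustration.
mask = (rng.random(size=(d_out, d_in)) > 0.3).astype(W.dtype)

delta = (alpha / r) * (B @ A)            # low-rank update, scaled by alpha/r
W_adapted = W + mask * delta             # protected entries remain unchanged
```

The design point the mask captures: the low-rank update alone (plain LoRA) touches every entry of W, whereas gating it leaves the protected entries byte-identical to the pretrained values, which is one simple way to express "low-damage" adaptation.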