LoKI: Low-damage Knowledge Implanting of Large Language Models

📅 2025-05-28
📈 Citations: 0
✨ Influential: 0
๐Ÿ“„ PDF
🤖 AI Summary
To address catastrophic forgetting (CF), a pervasive issue in parameter-efficient fine-tuning (PEFT) of large language models (LLMs), this paper proposes a low-damage knowledge implantation method. The approach is the first to integrate mechanistic-interpretability insights about how LLMs store knowledge internally directly into PEFT design. Grounded in an analysis of how Transformers represent knowledge, it combines low-rank adaptation with a knowledge-anchoring injection strategy to explicitly preserve general-purpose capabilities while adapting to downstream tasks. Experiments across multiple LLMs and realistic scenarios demonstrate that the method matches the task performance of full fine-tuning and LoRA while significantly outperforming existing PEFT methods in retaining general capabilities, achieving a superior trade-off between task-specific adaptation and generalization preservation.

๐Ÿ“ Abstract
Fine-tuning adapts pretrained models for specific tasks but poses the risk of catastrophic forgetting (CF), where critical knowledge from pre-training is overwritten. Current Parameter-Efficient Fine-Tuning (PEFT) methods for Large Language Models (LLMs), while efficient, often sacrifice general capabilities. To address the issue of CF in a general-purpose PEFT framework, we propose Low-damage Knowledge Implanting (LoKI), a PEFT technique that is based on a mechanistic understanding of how knowledge is stored in transformer architectures. In two real-world scenarios, LoKI demonstrates task-specific performance that is comparable to or even surpasses that of full fine-tuning and LoRA-based methods across various model types, while significantly better preserving general capabilities. Our work connects mechanistic insights into LLM knowledge storage with practical fine-tuning objectives, achieving state-of-the-art trade-offs between task specialization and the preservation of general capabilities. Our implementation is publicly available as ready-to-use code (https://github.com/Nexround/LoKI).
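The abstract describes LoKI as combining a mechanistic view of where knowledge lives in transformer weights with parameter-efficient adaptation. The paper's exact mechanism is not spelled out on this page, so the sketch below is only a hedged illustration of the general idea: a LoRA-style low-rank weight update that is masked so entries judged knowledge-critical stay untouched. All names here (`implant`, `safe_mask`, the rank-1 factors) are assumptions for illustration, not the authors' API.

```python
# Illustrative sketch (not the authors' implementation): a LoRA-style
# low-rank update applied only at positions flagged as safe to modify,
# leaving putative knowledge-storing weights exactly as pretrained.

def matmul(A, B):
    """Plain-Python matrix multiply for small illustrative matrices."""
    n, k, m = len(A), len(B), len(B[0])
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]

def implant(W, A, B, safe_mask):
    """Add the low-rank update B @ A to W, but only where safe_mask is 1.

    W: d_out x d_in pretrained weights (frozen elsewhere)
    B: d_out x r and A: r x d_in -- trainable low-rank factors
    safe_mask: d_out x d_in, 1 where updating is judged low-damage
    """
    delta = matmul(B, A)
    return [[W[i][j] + safe_mask[i][j] * delta[i][j]
             for j in range(len(W[0]))]
            for i in range(len(W))]

# Tiny example: 2x2 weights, a rank-1 update, one protected entry.
W = [[1.0, 2.0], [3.0, 4.0]]
B = [[1.0], [1.0]]          # d_out x r
A = [[0.5, 0.5]]            # r x d_in
mask = [[1, 0], [1, 1]]     # entry (0, 1) is treated as protected knowledge

W_new = implant(W, A, B, mask)
print(W_new)  # the protected entry (0, 1) stays exactly 2.0
```

The design point the sketch tries to capture is that, unlike vanilla LoRA, the update is anchored: the set of modifiable positions is chosen from an interpretability analysis rather than applied uniformly across the weight matrix.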
Problem

Research questions and friction points this paper is trying to address.

Preventing catastrophic forgetting in fine-tuned LLMs
Balancing task-specific performance and general capabilities
Mechanistic understanding of knowledge storage in transformers
Innovation

Methods, ideas, or system contributions that make the work stand out.

LoKI minimizes catastrophic forgetting in LLMs
Mechanistic understanding of transformer knowledge storage
Balances task performance and general capability preservation
Authors

Runyu Wang (Nantong University)
Peng Ping (Nantong University)
Zhengyu Guo (South China University of Technology)
Xiaoye Zhang (China Southern Power Grid Company Limited)
Quan Shi (Nantong University)
Liting Zhou (Assistant Professor, Dublin City University). Research interests: Educational Technology, Peer Learning, Psychology, Artificial Intelligence, Lifelogging
Tianbo Ji (Nantong University). Research interests: Natural Language Processing, Large Language Models