Parameter Importance is Not Static: Evolving Parameter Isolation for Supervised Fine-Tuning

📅 2026-04-15

🤖 AI Summary
This work addresses task interference and catastrophic forgetting in supervised fine-tuning. Existing parameter isolation methods do not fully resolve these problems because they assume that parameter importance stays fixed once it has been identified. The authors instead show that parameter importance drifts over the course of training and propose the Evolving Parameter Isolation (EPI) framework, which models and exploits this drift. EPI estimates parameter importance online from gradient-based signals and periodically updates its isolation masks, so that emerging task-critical parameters are protected while outdated ones are released to recover plasticity. On multiple multi-task benchmarks, EPI consistently reduces interference and forgetting relative to both static isolation and standard fine-tuning, while improving overall generalization.

📝 Abstract
Supervised Fine-Tuning (SFT) of large language models often suffers from task interference and catastrophic forgetting. Recent approaches alleviate this issue by isolating task-critical parameters during training. However, these methods represent a static solution to a dynamic problem, assuming that parameter importance remains fixed once identified. In this work, we empirically demonstrate that parameter importance exhibits temporal drift over the course of training. To address this, we propose Evolving Parameter Isolation (EPI), a fine-tuning framework that adapts isolation decisions based on online estimates of parameter importance. Instead of freezing a fixed subset of parameters, EPI periodically updates isolation masks using gradient-based signals, enabling the model to protect emerging task-critical parameters while releasing outdated ones to recover plasticity. Experiments on diverse multi-task benchmarks demonstrate that EPI consistently reduces interference and forgetting compared to static isolation and standard fine-tuning, while improving overall generalization. Our analysis highlights the necessity of synchronizing isolation mechanisms with the evolving dynamics of learning diverse abilities.
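
The periodic mask update described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not specify the importance score, the refresh schedule, or the protected fraction, so the exponential moving average of squared gradients, the `refresh_every` interval, the `protect_fraction` value, and all function names below are assumptions chosen for illustration. The sketch also collapses the multi-task setting into a single gradient stream.

```python
def update_importance(importance, grads, decay=0.9):
    """Online importance estimate: EMA of squared gradients (an assumed signal)."""
    return [decay * s + (1.0 - decay) * g * g for s, g in zip(importance, grads)]


def build_mask(importance, protect_fraction=0.25):
    """Mark the top `protect_fraction` of parameters by importance as isolated (True)."""
    k = int(len(importance) * protect_fraction)
    ranked = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    protected = set(ranked[:k])
    return [i in protected for i in range(len(importance))]


def masked_sgd_step(params, grads, mask, lr=0.1):
    """Plain SGD update that skips every parameter whose mask entry is True."""
    return [p if m else p - lr * g for p, g, m in zip(params, grads, mask)]


def train(params, grad_fn, steps=60, refresh_every=10,
          protect_fraction=0.25, decay=0.9, lr=0.1):
    """Fine-tuning loop whose isolation mask is rebuilt every `refresh_every` steps.

    `grad_fn(params, step)` is a hypothetical callback returning one gradient
    value per parameter. Returns the final parameters and the list of masks
    produced at each refresh.
    """
    importance = [0.0] * len(params)
    mask = [False] * len(params)  # nothing is isolated before the first refresh
    mask_history = []
    for step in range(steps):
        grads = grad_fn(params, step)
        importance = update_importance(importance, grads, decay)
        if step > 0 and step % refresh_every == 0:
            # A refresh both protects newly important parameters and
            # releases parameters whose importance has decayed.
            mask = build_mask(importance, protect_fraction)
            mask_history.append(mask)
        params = masked_sgd_step(params, grads, mask, lr)
    return params, mask_history
```

A static isolation baseline corresponds to computing the mask once and never refreshing it. In this sketch, a parameter whose gradient signal fades loses its protected status at a later refresh, which is the release behavior the abstract refers to.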
Problem

Research questions and friction points this paper is trying to address.

parameter importance
supervised fine-tuning
task interference
catastrophic forgetting
temporal drift
Innovation

Methods, ideas, or system contributions that make the work stand out.

Evolving Parameter Isolation
parameter importance drift
dynamic parameter isolation
supervised fine-tuning
catastrophic forgetting