Efficient Model Editing with Task-Localized Sparse Fine-tuning

📅 2025-04-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing task-arithmetic methods for model editing rely on network linearization to derive task vectors. This incurs high computational overhead and does not by itself guarantee weight disentanglement, which hinders conflict-free composition of task vectors. This paper proposes TaLoS, an efficient editing framework based on task-localized sparse fine-tuning. The authors first identify a subset of parameters in pre-trained models that exhibit consistently low gradient sensitivity across tasks. Sparsely updating only these parameters promotes weight disentanglement during fine-tuning without requiring linearization, and yields composable, low-interference sparse task vectors. Experiments show that the method improves training and inference efficiency while outperforming current methods in task addition and task negation. By enabling modular parameter editing, it supports practical deployment of adaptable foundation models.

📝 Abstract

Task arithmetic has emerged as a promising approach for editing models by representing task-specific knowledge as composable task vectors. However, existing methods rely on network linearization to derive task vectors, leading to computational bottlenecks during training and inference. Moreover, linearization alone does not ensure weight disentanglement, the key property that enables conflict-free composition of task vectors. To address this, we propose TaLoS, which builds sparse task vectors with minimal interference without requiring explicit linearization or sharing information across tasks. We find that pre-trained models contain a subset of parameters with consistently low gradient sensitivity across tasks, and that sparsely updating only these parameters promotes weight disentanglement during fine-tuning. Our experiments show that TaLoS improves training and inference efficiency while outperforming current methods in task addition and negation. By enabling modular parameter editing, our approach fosters practical deployment of adaptable foundation models in real-world applications.
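For readers new to task arithmetic, the following is a minimal sketch of the general idea the abstract builds on: a task vector is the difference between fine-tuned and pre-trained weights, and editing means adding (or subtracting) such vectors. It is not the paper's implementation. The helper names, the flat-list weight representation, the scaling coefficient `alpha`, and the toy numbers are all assumptions made for illustration.

```python
# Illustrative sketch of task arithmetic on flat weight lists.
# All names and values here are hypothetical, not taken from the paper.

def task_vector(pretrained, finetuned):
    """Element-wise difference: fine-tuned weights minus pre-trained weights."""
    return [f - p for p, f in zip(pretrained, finetuned)]

def negate(vector):
    """Flip the sign of a task vector (used for task negation)."""
    return [-v for v in vector]

def apply_task_vectors(pretrained, vectors, alpha=1.0):
    """Add each task vector, scaled by alpha, to the pre-trained weights."""
    edited = list(pretrained)
    for vec in vectors:
        edited = [w + alpha * v for w, v in zip(edited, vec)]
    return edited

# Toy weights: two fine-tuned models that changed different coordinates.
theta_0 = [1.0, 2.0, 3.0, 4.0]   # pre-trained
theta_a = [1.5, 2.0, 3.0, 4.0]   # fine-tuned on task A
theta_b = [1.0, 2.0, 2.0, 4.0]   # fine-tuned on task B

tau_a = task_vector(theta_0, theta_a)
tau_b = task_vector(theta_0, theta_b)

# Task addition: one model carrying both edits.
multi_task = apply_task_vectors(theta_0, [tau_a, tau_b])

# Task negation: move away from task A.
forget_a = apply_task_vectors(theta_0, [negate(tau_a)])
```

In this toy case the two task vectors touch different coordinates, so adding them reproduces each edit exactly. Real task vectors generally overlap, which is where interference, and the need for weight disentanglement, comes from.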

Problem

Research questions and friction points this paper is trying to address.

Deriving task vectors through network linearization makes model editing inefficient

Linearization creates computational bottlenecks in training and inference

Linearization alone does not ensure the weight disentanglement needed for conflict-free task composition

Innovation

Methods, ideas, or system contributions that make the work stand out.

Sparse task vectors without linearization

Updates only parameters with consistently low gradient sensitivity across tasks

Enhances weight disentanglement in fine-tuning
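The points above can be illustrated with a small sketch of sparse fine-tuning restricted to low-sensitivity parameters. The summary does not say how sensitivity is measured or how the parameter subset is chosen, so this example assumes a per-parameter score (for instance an averaged gradient magnitude) and a hypothetical `keep_ratio`; the function names, the plain SGD step, and the numbers are likewise illustrative rather than the paper's procedure.

```python
# Illustrative sketch: update only the parameters with the lowest
# sensitivity scores. Scores, keep_ratio, and function names are assumptions.

def low_sensitivity_mask(scores, keep_ratio):
    """Return a 0/1 mask selecting the keep_ratio fraction of lowest scores."""
    k = max(1, int(len(scores) * keep_ratio))
    lowest = sorted(range(len(scores)), key=lambda i: scores[i])[:k]
    mask = [0] * len(scores)
    for i in lowest:
        mask[i] = 1
    return mask

def masked_sgd_step(weights, grads, mask, lr):
    """Plain SGD step in which masked-out parameters stay frozen."""
    return [w - lr * g * m for w, g, m in zip(weights, grads, mask)]

# Hypothetical per-parameter sensitivity scores for a 4-parameter model.
scores = [0.9, 0.1, 0.5, 0.05]
mask = low_sensitivity_mask(scores, keep_ratio=0.5)

weights = [1.0, 2.0, 3.0, 4.0]
grads = [1.0, 1.0, 1.0, 1.0]
updated = masked_sgd_step(weights, grads, mask, lr=0.5)

# The resulting task vector is sparse: zero wherever the mask is zero.
sparse_task_vector = [u - w for u, w in zip(updated, weights)]
```

Because frozen parameters contribute exact zeros, the task vector produced this way is sparse by construction, with no linearized model involved.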
