🤖 AI Summary
Existing task-arithmetic methods for model editing rely on network linearization to derive task vectors, which incurs high computational overhead and does not guarantee weight disentanglement, hindering conflict-free composition of task vectors. This paper proposes TaLoS, an efficient editing framework based on task-localized sparse fine-tuning. It first identifies a subset of parameters in pre-trained models that exhibit consistently low gradient sensitivity across diverse tasks; updating only these parameters promotes weight disentanglement without requiring linearization. The resulting sparse task vectors are composable and interfere minimally with one another. Experiments show that the method improves training and inference efficiency while outperforming current approaches in task addition and task negation, which supports more flexible deployment of foundation models in real-world scenarios.
📝 Abstract
Task arithmetic has emerged as a promising approach for editing models by representing task-specific knowledge as composable task vectors. However, existing methods rely on network linearization to derive task vectors, leading to computational bottlenecks during training and inference. Moreover, linearization alone does not ensure weight disentanglement, the key property that enables conflict-free composition of task vectors. To address this, we propose TaLoS, which builds sparse task vectors with minimal interference without requiring explicit linearization or sharing information across tasks. We find that pre-trained models contain a subset of parameters with consistently low gradient sensitivity across tasks, and that sparsely updating only these parameters promotes weight disentanglement during fine-tuning. Our experiments show that TaLoS improves training and inference efficiency while outperforming current methods in task addition and negation. By enabling modular parameter editing, our approach fosters practical deployment of adaptable foundation models in real-world applications.
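To make the idea concrete, the following is a minimal sketch of sparse fine-tuning followed by task arithmetic on a toy parameter vector. It is not the authors' implementation: the function names, the `keep_fraction` value, the toy numbers, and the use of plain gradient magnitude as the sensitivity score are all assumptions made for illustration. The mask is computed from a single task's gradients, in keeping with the abstract's statement that no information is shared across tasks.

```python
# Illustrative sketch only. All names and values here are hypothetical;
# gradient magnitude stands in for whatever sensitivity measure the paper uses.

def low_sensitivity_mask(grad, keep_fraction):
    """Return a 0/1 mask selecting the parameters with the smallest
    gradient magnitude for one task."""
    n = len(grad)
    k = max(1, int(keep_fraction * n))
    order = sorted(range(n), key=lambda i: abs(grad[i]))
    chosen = set(order[:k])
    return [1.0 if i in chosen else 0.0 for i in range(n)]

def masked_sgd_step(theta, grad, mask, lr):
    """One gradient step that updates only the masked-in parameters."""
    return [t - lr * m * g for t, g, m in zip(theta, grad, mask)]

def task_vector(theta_finetuned, theta_pretrained):
    """Task vector: fine-tuned weights minus pre-trained weights."""
    return [f - p for f, p in zip(theta_finetuned, theta_pretrained)]

def apply_task_vectors(theta_pretrained, vectors, alphas):
    """Task arithmetic: add scaled task vectors to the pre-trained weights.
    A positive alpha adds a task; a negative alpha negates it."""
    edited = list(theta_pretrained)
    for tau, alpha in zip(vectors, alphas):
        edited = [e + alpha * t for e, t in zip(edited, tau)]
    return edited

# Toy setup: four parameters and made-up gradients for one task.
theta_pre = [1.0, -2.0, 0.5, 3.0]
grad_a = [0.9, 0.01, 0.5, 0.02]

mask_a = low_sensitivity_mask(grad_a, keep_fraction=0.5)
theta_a = masked_sgd_step(theta_pre, grad_a, mask_a, lr=0.1)
tau_a = task_vector(theta_a, theta_pre)

# Task addition (alpha = +1) and task negation (alpha = -1).
theta_added = apply_task_vectors(theta_pre, [tau_a], [1.0])
theta_negated = apply_task_vectors(theta_pre, [tau_a], [-1.0])
```

Because the update is masked, the task vector is zero wherever the mask is zero, so task vectors from different tasks can overlap only on the parameters each task chose to update. This sparsity is what the sketch is meant to show; whether it yields weight disentanglement in practice is the paper's empirical claim, not something this toy example demonstrates.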