AI Summary
This work addresses cross-task interference and representation drift in task arithmetic, where combining multiple task vectors often degrades performance. Existing decoupling approaches rely on external data, compromising modularity and privacy. To overcome this limitation, we propose the first data-free weight decoupling method, formulating representation-drift regularization as a curvature matrix approximation problem. Leveraging Kronecker-Factored Approximate Curvature (K-FAC), we construct a data-independent regularizer that requires no task-specific data. The resulting method scales with constant complexity in the number of tasks, removes reliance on validation sets for hyperparameter tuning, is robust to task vector scaling, and achieves state-of-the-art performance in both task addition and negation.
Abstract
Task Arithmetic offers a modular, scalable way to adapt foundation models. Combining multiple task vectors, however, can cause cross-task interference, leading to representation drift and degraded performance. Representation drift regularization provides a natural remedy for disentangling task vectors; however, existing approaches typically require external task data, conflicting with modularity and data availability constraints (e.g., privacy requirements). We propose a data-free approach by framing regularization against representation drift as a curvature matrix approximation problem. This lets us draw on well-established techniques; in particular, we adopt Kronecker-Factored Approximate Curvature and obtain a practical regularizer that achieves state-of-the-art results in task addition and negation. Our method has constant complexity in the number of tasks and is robust to task vector rescaling, eliminating the need for held-out hyperparameter tuning.
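To make the setting concrete, here is a minimal sketch of task arithmetic with a Kronecker-factored drift penalty. The shapes, the identity placeholder factors `A` and `G`, and the helper `kfac_quadratic` are illustrative assumptions, not the paper's implementation; the key point is that a K-FAC-style quadratic form can be evaluated factor-wise, without materializing the full Kronecker product and without any task data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single linear layer W of a "foundation model".
d_in, d_out = 4, 3
W_base = rng.normal(size=(d_out, d_in))

# Task vectors: tau_t = W_finetuned_t - W_base (the standard task-arithmetic recipe).
tau_1 = 0.1 * rng.normal(size=(d_out, d_in))
tau_2 = 0.1 * rng.normal(size=(d_out, d_in))

# Task addition: apply the sum of task vectors with a scaling coefficient alpha.
alpha = 1.0
W_merged = W_base + alpha * (tau_1 + tau_2)

# K-FAC approximates a layer's curvature as a Kronecker product A (x) G, with
# A an input-side factor and G an output-side factor. A data-free variant
# (an assumption here) would build A and G without task data; we use identity
# placeholders purely for illustration.
A = np.eye(d_in)   # input-side Kronecker factor, d_in x d_in
G = np.eye(d_out)  # output-side Kronecker factor, d_out x d_out

def kfac_quadratic(tau, A, G):
    """Quadratic form vec(tau)^T (A (x) G) vec(tau), computed factor-wise as
    trace(G @ tau @ A @ tau.T) to avoid forming the full Kronecker product."""
    return float(np.trace(G @ tau @ A @ tau.T))

# Drift regularizer on the merged update, evaluated with no task data.
drift = kfac_quadratic(tau_1 + tau_2, A, G)
```

With identity factors the quadratic form reduces to the squared Frobenius norm of the update; non-trivial factors reweight directions by the approximated curvature. Because the penalty depends only on the (fixed-size) per-layer factors, its cost does not grow with the number of tasks being merged.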