🤖 AI Summary
Model merging via parameter averaging enables zero-shot multi-task learning but suffers from cross-task parameter interference, degrading performance. This work reveals that weight averaging implicitly constructs centered task vectors, whose principal components encode task-specific knowledge. Building on this insight, we propose a "centralization + low-rank approximation" fusion framework: first, task vectors are centered to eliminate shared biases; then, singular value decomposition (SVD) is applied for low-rank approximation, effectively suppressing interference. The method yields robust and scalable performance gains across vision benchmarks—supporting varying numbers of tasks and model scales—and achieves competitive results on NLP multi-task benchmarks. Our core contributions are twofold: (i) establishing, for the first time, a theoretical connection between weight averaging and task vector centralization; and (ii) introducing a geometry-inspired low-rank fusion paradigm grounded in task vector structure.
📝 Abstract
Model merging aims to build a multi-task learner by combining the parameters of individually fine-tuned models without additional training. While a straightforward approach is to average model parameters across tasks, this often results in suboptimal performance due to interference among parameters across tasks. In this paper, we present the intriguing result that weight averaging implicitly induces task vectors centered around the weight average itself, and that applying a low-rank approximation to these centered task vectors significantly improves merging performance. Our analysis shows that centering the task vectors effectively reduces task interference and that most of the task-specific knowledge is concentrated in the top singular vectors. Our method demonstrates robust and scalable performance on vision benchmarks across varying numbers of tasks and model sizes. Furthermore, we observe that our approach extends to natural language processing tasks with competitive performance.
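The pipeline described above—form task vectors, center them around their mean (the implicit effect of weight averaging), truncate each centered vector to its top singular components, and recombine—can be sketched per weight matrix as below. This is a minimal NumPy illustration, not the paper's implementation: the `scaling` coefficient and the exact rule for recombining the low-rank residuals with the weight average are illustrative assumptions.

```python
import numpy as np

def low_rank(mat, r):
    """Rank-r approximation of a 2-D weight matrix via truncated SVD."""
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    return (u[:, :r] * s[:r]) @ vt[:r]

def merge(pretrained, finetuned, rank, scaling=1.0):
    """Merge fine-tuned weight matrices using centered, low-rank task vectors.

    pretrained: the shared initialization (2-D array).
    finetuned:  list of fine-tuned weight matrices, one per task.
    rank:       number of singular components kept per centered task vector.
    scaling:    illustrative coefficient for the low-rank residual term.
    """
    # Task vectors: per-task deviations from the pretrained weights.
    task_vectors = [w - pretrained for w in finetuned]
    # Weight averaging equals pretrained + mean task vector; centering
    # the task vectors around this mean removes the shared component.
    mean_tv = np.mean(task_vectors, axis=0)
    centered = [tv - mean_tv for tv in task_vectors]
    # Keep only the top singular directions of each centered task vector,
    # where most task-specific knowledge concentrates.
    residual = sum(low_rank(c, rank) for c in centered) / len(centered)
    return pretrained + mean_tv + scaling * residual
```

Note one sanity check implied by the construction: the centered task vectors sum to zero, so at full rank the residual term vanishes and the merge reduces exactly to plain weight averaging; truncating the rank is what changes the result.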