Subspace-Boosted Model Merging

📅 2025-06-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Merging many specialized expert models suffers performance degradation due to rank collapse in the task vector space, a phenomenon newly identified in this work. Method: the authors propose Subspace Boosting, a framework comprising (1) operations on the singular-value-decomposed task vector space that explicitly preserve its rank; (2) Higher-Order Generalized Singular Value Decomposition (HO-GSVD) to quantify task similarity, improving interpretability; and (3) efficient merging via task arithmetic combined with SVD. Results: on vision benchmarks, merging up to 20 expert models yields performance gains of more than 10%, substantially mitigating diminishing returns while improving robustness and cross-task generalization.
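The rank-collapse phenomenon the summary describes can be illustrated with a minimal NumPy sketch. The task vectors below are synthetic stand-ins (random deltas sharing one dominant direction), not weights from real experts, and the entropy-based effective rank is one common proxy, not necessarily the paper's exact measure:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_experts = 512, 20

# Hypothetical task vectors: each expert's weight delta from a shared base
# model, modeled as a common direction plus small task-specific noise.
shared = rng.standard_normal(d)
task_vectors = [shared + 0.1 * rng.standard_normal(d) for _ in range(n_experts)]

# Stack the task vectors and inspect the singular value spectrum.
T = np.stack(task_vectors)            # shape (n_experts, d)
s = np.linalg.svd(T, compute_uv=False)

# Effective rank via the entropy of the normalized spectrum: values far
# below n_experts mean the task space has collapsed onto few directions.
p = s / s.sum()
effective_rank = np.exp(-(p * np.log(p)).sum())
print(f"leading singular values: {np.round(s[:5], 2)}")
print(f"effective rank: {effective_rank:.2f} of {n_experts}")
```

Because one direction dominates the spectrum, the effective rank comes out far below the number of experts, mirroring the collapse the paper reports across merging methods.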

📝 Abstract
Model merging enables the combination of multiple specialized expert models into a single model capable of performing multiple tasks. However, the benefits of merging an increasing number of specialized experts generally lead to diminishing returns and reduced overall performance gains. In this work, we offer an explanation and analysis from a task arithmetic perspective, revealing that as the merging process (across numerous existing merging methods) continues for more and more experts, the associated task vector space experiences rank collapse. To mitigate this issue, we introduce Subspace Boosting, which operates on the singular value decomposed task vector space and maintains task vector ranks. Subspace Boosting raises merging efficacy for up to 20 expert models by large margins of more than 10% when evaluated on vision benchmarks. Moreover, we propose employing Higher-Order Generalized Singular Value Decomposition to further quantify task similarity, offering a new interpretable perspective on model merging.
Problem

Research questions and friction points this paper is trying to address.

Mitigates rank collapse in task vector space during model merging
Improves merging efficacy for up to 20 expert models
Quantifies task similarity using Higher-Order Generalized SVD
Innovation

Methods, ideas, or system contributions that make the work stand out.

Subspace Boosting maintains task vector ranks
Higher-Order Generalized SVD quantifies task similarity
Singular value decomposition enhances merging efficacy
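One way the SVD-based rank maintenance listed above could look is sketched below. This is an illustrative equalization of the singular value spectrum on synthetic task vectors, assuming a simple boosting rule (lifting tail singular values toward the leading one); the paper's actual Subspace Boosting procedure may differ:

```python
import numpy as np

rng = np.random.default_rng(1)
n_experts, d = 20, 256

# Synthetic task-vector matrix whose rank has nearly collapsed onto one
# shared direction (illustrative, not real expert weights).
shared = rng.standard_normal(d)
T = np.stack([shared + 0.3 * rng.standard_normal(d) for _ in range(n_experts)])

U, s, Vt = np.linalg.svd(T, full_matrices=False)

# Boost the tail of the spectrum toward the leading singular value so that
# all task directions keep a meaningful contribution after merging
# (assumed boosting rule for illustration).
s_boosted = np.maximum(s, 0.25 * s[0])
T_boosted = U @ np.diag(s_boosted) @ Vt

def effective_rank(M):
    """Entropy-based effective rank of a matrix's singular spectrum."""
    sv = np.linalg.svd(M, compute_uv=False)
    p = sv / sv.sum()
    return float(np.exp(-(p * np.log(p)).sum()))

print(f"effective rank before boosting: {effective_rank(T):.2f}")
print(f"effective rank after boosting:  {effective_rank(T_boosted):.2f}")
```

The boosted matrix spans the same subspace (same singular vectors) but distributes energy across more directions, which is the property the abstract credits with restoring merging efficacy.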
Ronald Skorobogat
Technical University of Munich, School of Computation, Information and Technology; Helmholtz Munich
Karsten Roth
Research Scientist at Google DeepMind
Foundation Models · Continual Learning · Post-Training · Vision and Language
Mariana-Iuliana Georgescu
Technical University of Munich, School of Computation, Information and Technology; Helmholtz Munich; Munich Center for Machine Learning (MCML); Munich Data Science Institute (MDSI)
Zeynep Akata
Professor at Technical University of Munich and Director at Helmholtz Munich
Machine Learning · Vision and Language · Zero-Shot Learning