From Task-Specific Models to Unified Systems: A Review of Model Merging Approaches

📅 2025-03-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses fundamental challenges in model merging—including the absence of a unified taxonomy, terminological inconsistency, incomparable methodologies, and difficulties in multi-task fusion under data-unavailable scenarios. We propose the first three-tiered classification paradigm encompassing weight-space fusion, gradient alignment, and task disentanglement. We establish a cross-method reproducible evaluation benchmark and formally define and distinguish the applicability boundaries of “data-agnostic” versus “data-aware” merging. By unifying the theoretical formulations of over 20 state-of-the-art methods—via spectral analysis, normalization sensitivity diagnosis, and task vector geometric modeling—we identify three root causes of merging failure: directional conflict, scale mismatch, and task entanglement. Our framework provides systematic theoretical foundations and principled design guidelines for efficient, lightweight, and interpretable model fusion.
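Two of the failure causes named above, directional conflict and scale mismatch, can be illustrated with a small weight-space example. The sketch below is not taken from the paper: it assumes the common task-vector formulation (fine-tuned weights minus base weights, summed and added back to the base), and the helper names `task_vector`, `cosine`, and `merge`, along with the toy three-parameter "models", are hypothetical.

```python
import math

def task_vector(base, finetuned):
    # Assumed formulation: task vector = fine-tuned weights - base weights.
    return [f - b for b, f in zip(base, finetuned)]

def cosine(u, v):
    # Cosine similarity; values near -1 indicate directional conflict.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def merge(base, finetuned_models, scale=1.0):
    # Sum the task vectors and add the scaled sum back onto the base weights.
    vectors = [task_vector(base, m) for m in finetuned_models]
    summed = [sum(col) for col in zip(*vectors)]
    return [b + scale * s for b, s in zip(base, summed)]

# Toy weights (hypothetical): each "model" is a flat list of three parameters.
base = [0.0, 1.0, 2.0]
model_a = [1.0, 1.0, 2.0]   # changes parameter 0 by +1
model_b = [0.0, 3.0, 2.0]   # changes parameter 1 by +2 (larger update norm than model_a)
model_c = [-1.0, 1.0, 2.0]  # changes parameter 0 by -1 (opposes model_a)

# Compatible updates: both changes survive the merge.
merged_ab = merge(base, [model_a, model_b])

# Conflicting updates: the two task vectors point in opposite directions
# and cancel, so the merged model falls back to the base weights.
conflict_ac = cosine(task_vector(base, model_a), task_vector(base, model_c))
merged_ac = merge(base, [model_a, model_c])
```

Here `model_a` and `model_b` touch different parameters, so summing their task vectors keeps both updates, although `model_b`'s update has twice the norm and would dominate any shared direction. `model_a` and `model_c` have a cosine similarity of -1, and naive summation erases both fine-tuning effects.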

📝 Abstract

Model merging has achieved significant success, with numerous innovative methods proposed to enhance capabilities by combining multiple models. However, challenges persist due to the lack of a unified framework for classification and systematic comparative analysis, leading to inconsistencies in terminologies and categorizations. Meanwhile, as an increasing number of fine-tuned models are publicly available, their original training data often remain inaccessible due to privacy concerns or intellectual property restrictions. This makes traditional multi-task learning based on shared training data impractical. In scenarios where direct access to training data is infeasible, merging model parameters to create a unified model with broad generalization across multiple domains becomes crucial, further underscoring the importance of model merging techniques. Despite the rapid progress in this field, a comprehensive taxonomy and survey summarizing recent advances and predicting future directions are still lacking. This paper addresses these gaps by establishing a new taxonomy of model merging methods, systematically comparing different approaches, and providing an overview of key developments. By offering a structured perspective on this evolving area, we aim to help newcomers quickly grasp the field's landscape and inspire further innovations.

Problem

Research questions and friction points this paper is trying to address.

Lack of unified framework for model merging classification.

Inaccessibility of original training data for fine-tuned models.

Need for comprehensive taxonomy and survey in model merging.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Develops a taxonomy for model merging methods.

Systematically compares diverse merging approaches.

Provides an overview of key model merging developments.
