🤖 AI Summary
This work addresses fundamental challenges in model merging: the absence of a unified taxonomy, inconsistent terminology, methods that cannot be compared directly, and the difficulty of multi-task fusion when training data are unavailable. We propose the first three-tiered classification paradigm, covering weight-space fusion, gradient alignment, and task disentanglement. We establish a reproducible cross-method evaluation benchmark and formally define the applicability boundaries of “data-agnostic” versus “data-aware” merging. By unifying the theoretical formulations of over 20 state-of-the-art methods through spectral analysis, normalization sensitivity diagnosis, and geometric modeling of task vectors, we identify three root causes of merging failure: directional conflict, scale mismatch, and task entanglement. Our framework provides systematic theoretical foundations and principled design guidelines for efficient, lightweight, and interpretable model fusion.
📝 Abstract
Model merging has achieved significant success, with numerous innovative methods proposed to enhance capabilities by combining multiple models. However, the field still lacks a unified framework for classification and systematic comparative analysis, which has led to inconsistent terminology and categorization. Meanwhile, as an increasing number of fine-tuned models become publicly available, their original training data often remain inaccessible because of privacy concerns or intellectual property restrictions. This makes traditional multi-task learning, which relies on shared training data, impractical. When direct access to training data is infeasible, merging model parameters into a single model that generalizes broadly across multiple domains becomes crucial, which further underscores the importance of model merging techniques. Despite rapid progress in this field, a comprehensive taxonomy and a survey that summarizes recent advances and anticipates future directions are still lacking. This paper addresses these gaps by establishing a new taxonomy of model merging methods, systematically comparing different approaches, and providing an overview of key developments. By offering a structured perspective on this evolving area, we aim to help newcomers quickly grasp the field's landscape and to inspire further innovation.