Multi-Level Collaboration in Model Merging

📅 2025-03-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates whether parameter-level model fusion can approximate the performance of prediction-level ensembling without constraints on model count, architecture type, or pretraining initialization. To this end, we propose NeuLig, a theory-driven fusion framework that enables robust parameter fusion across multiple models with heterogeneous architectures and distinct initializations. NeuLig integrates theoretical modeling, a customized loss design, and joint optimization to align the fused parameters with ensemble behavior. On a benchmark fusing five CLIP-ViT-B/32 models, it achieves 95.44% accuracy, nearly matching the ensemble's 95.46%, and substantially outperforms prior fusion approaches. Our work establishes a general theoretical connection between parameter fusion and ensembling, empirically validating that fusion can attain high accuracy, strong scalability, and architecture-agnostic behavior. These results point toward a new paradigm for efficient model compression and collaborative learning.

📝 Abstract
Parameter-level model merging is an emerging paradigm in multi-task learning with significant promise. Previous research has explored its connections with prediction-level model ensembling, commonly viewed as the upper bound for merging, to reveal the potential of achieving performance consistency between the two. However, this observation relies on certain preconditions, such as being limited to two models, using ViT-based models, and fine-tuning all models from the same pre-trained checkpoint. To further understand the intrinsic connections between model merging and model ensembling, this paper explores an interesting possibility: if these restrictions are removed, can performance consistency still be achieved between merging and ensembling? To answer this question, we first theoretically establish a performance correlation between merging and ensembling. We find that even when the previous restrictions are not met, model merging can still attain performance nearly identical to, or even surpassing, that of ensembling. To verify whether our findings are practical, we introduce a validation framework termed Neural Ligand (NeuLig). The learning process of NeuLig is carefully designed around a specialized loss function with theoretical foundations. Experimental results demonstrate the resilience of NeuLig with respect to both model scale and the number of collaborating models. For instance, in the case involving 5 CLIP-ViT-B/32 models, parameter-level merging achieves the same performance as prediction-level ensembling (merging: 95.44% vs. ensembling: 95.46%).
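To make the merging-vs-ensembling distinction in the abstract concrete, here is a minimal sketch (not the NeuLig method) contrasting the two operations on toy linear models. For linear models, a weighted average of parameters yields exactly the same predictions as the same weighted average of outputs, which hints at the correlation the paper studies; nonlinear networks break this identity, motivating a learned merging procedure. All names below are illustrative.

```python
# Toy illustration (assumed setup, not from the paper): each "model" is a
# weight vector for a linear predictor; merging averages weights, while
# ensembling averages predictions.

def predict(weights, x):
    """Linear model: dot product of weights and input features."""
    return sum(w * xi for w, xi in zip(weights, x))

def merge_parameters(models, coeffs):
    """Parameter-level merging: coefficient-weighted average of weights."""
    return [sum(c * m[i] for c, m in zip(coeffs, models))
            for i in range(len(models[0]))]

def ensemble_predictions(models, coeffs, x):
    """Prediction-level ensembling: coefficient-weighted average of outputs."""
    return sum(c * predict(m, x) for c, m in zip(coeffs, models))

models = [[1.0, 2.0], [3.0, 0.5], [0.0, 1.5]]  # three 2-weight models
coeffs = [0.5, 0.3, 0.2]                        # merging/ensembling weights
x = [1.0, -1.0]

merged = merge_parameters(models, coeffs)
# For linear models the two quantities coincide exactly.
gap = abs(predict(merged, x) - ensemble_predictions(models, coeffs, x))
print(gap < 1e-9)  # → True
```

The equivalence above is what makes linear models a degenerate case; the paper's question is whether a comparable consistency can be recovered for deep nonlinear models, where no such identity holds.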
Problem

Research questions and friction points this paper is trying to address.

Explores performance consistency between model merging and ensembling.
Investigates whether merging can match ensembling once prior restrictions (two models, ViT backbones, a shared pre-trained checkpoint) are removed.
Introduces NeuLig to validate theoretical findings on model merging.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parameter-level merging achieves ensembling-like performance.
NeuLig framework validates merging-ensembling performance correlation.
Specialized loss function enhances model merging resilience.
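The innovation bullets mention a specialized loss that aligns the merged model with ensemble behavior. The paper does not give the loss here, so the sketch below is a hypothetical toy version of the general idea: learn merging coefficients by minimizing the squared gap between the merged model's predictions and the ensemble's predictions on a small nonlinear (ReLU) model, using finite-difference gradient descent. Every function and constant is illustrative.

```python
# Hypothetical alignment loss (assumed, not NeuLig's actual objective):
# fit merging coefficients so the merged model imitates the ensemble.

def relu(z):
    return max(0.0, z)

def predict(w, x):
    # Toy nonlinear model: relu(w0 * x) + w1.
    return relu(w[0] * x) + w[1]

def merged_weights(models, coeffs):
    return [sum(c * m[i] for c, m in zip(coeffs, models))
            for i in range(len(models[0]))]

def alignment_loss(models, coeffs, xs):
    """Mean squared gap between merged and ensemble predictions."""
    total = 0.0
    for x in xs:
        ens = sum(c * predict(m, x) for c, m in zip(coeffs, models))
        mrg = predict(merged_weights(models, coeffs), x)
        total += (mrg - ens) ** 2
    return total / len(xs)

def fit_coeffs(models, coeffs, xs, lr=0.02, steps=300, eps=1e-5):
    """Finite-difference gradient descent on the merging coefficients."""
    for _ in range(steps):
        base = alignment_loss(models, coeffs, xs)
        grads = []
        for i in range(len(coeffs)):
            bumped = list(coeffs)
            bumped[i] += eps
            grads.append((alignment_loss(models, bumped, xs) - base) / eps)
        coeffs = [c - lr * g for c, g in zip(coeffs, grads)]
    return coeffs

models = [[1.0, 0.5], [-2.0, 1.0]]
xs = [-1.0, 0.0, 1.0, 2.0]
start = [0.5, 0.5]
fitted = fit_coeffs(models, start, xs)
print(alignment_loss(models, fitted, xs) < alignment_loss(models, start, xs))
```

For nonlinear models, merging and ensembling generally disagree, so the loss is nonzero at the start; optimizing the coefficients shrinks the gap. NeuLig's actual design additionally involves theoretical modeling and joint optimization across models, which this toy omits.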