Evolutionary Optimization of Model Merging Recipes

📅 2024-03-19
🏛️ arXiv.org
📈 Citations: 61
Influential: 6
🤖 AI Summary
Problem: Existing large language model (LLM) merging techniques suffer from unstable fusion performance, heavy reliance on manual expertise, and high computational overhead.

Method: This paper proposes an automated model merging framework based on Differential Evolution (DE), the first to systematically integrate multi-objective evolutionary algorithms into model merging recipe search. It jointly optimizes critical merging strategies (weight allocation, layer selection, and normalization schemes) to simultaneously enhance performance, efficiency, and generalization. Leveraging a parameterized merging template and a zero-shot task validation mechanism, the framework eliminates the need for additional fine-tuning or labeled data.

Contribution/Results: Evaluated across 12 downstream tasks, the method achieves an average accuracy improvement of 2.7%, significantly outperforming handcrafted recipes and simple averaging baselines. Its search cost is also 90% lower than that of exhaustive grid search.
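To make the search loop described in the summary concrete, here is a minimal sketch of differential evolution applied to per-model merge weights. It is an illustration under stated assumptions, not the paper's implementation: the summary does not specify the DE variant, the merging template, or the validation tasks, so this uses the classic DE/rand/1/bin scheme on toy parameter vectors, and the names `merge`, `differential_evolution`, `experts`, `target`, and `score` are all hypothetical. The `score` function is a stand-in for a validation metric.

```python
import random


def merge(models, weights):
    """Weighted average of parameter vectors; weights are normalized to sum to 1."""
    total = sum(weights)
    norm = [w / total for w in weights]
    size = len(models[0])
    return [sum(w * m[i] for w, m in zip(norm, models)) for i in range(size)]


def differential_evolution(fitness, dim, pop_size=20, generations=100,
                           f=0.6, cr=0.9, seed=0):
    """Maximize `fitness` over a box using DE/rand/1/bin (assumed variant)."""
    rng = random.Random(seed)
    lo, hi = 0.01, 1.0  # keep weights positive so normalization is well defined
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for d in range(dim):
                if rng.random() < cr or d == j_rand:
                    v = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    v = min(hi, max(lo, v))  # clip back into the box
                else:
                    v = pop[i][d]
                trial.append(v)
            s = fitness(trial)
            if s >= scores[i]:  # greedy selection: keep the better candidate
                pop[i], scores[i] = trial, s
    best = max(range(pop_size), key=lambda k: scores[k])
    return pop[best], scores[best]


# Toy setup (hypothetical): three "expert models" as tiny parameter vectors,
# and a target vector standing in for whatever the validation tasks reward.
experts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
target = [0.2, 0.3, 0.5]


def score(weights):
    """Stand-in for a validation score: negative squared error to the target."""
    merged = merge(experts, weights)
    return -sum((m - t) ** 2 for m, t in zip(merged, target))


best_weights, best_score = differential_evolution(score, dim=len(experts))
print("best weights:", best_weights)
print("best score:", best_score)
```

In a real setting, `merge` would operate on model checkpoints and `score` would run the merged model on held-out tasks, which is where almost all of the cost lies. The search loop itself stays this small, and a library optimizer such as `scipy.optimize.differential_evolution` could replace the hand-written one.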

Problem

Research questions and friction points this paper is trying to address.

Automatic Model Combination
Resource Efficiency
Cross-Domain Application

Innovation

Methods, ideas, or system contributions that make the work stand out.

Automated Model Fusion
Evolutionary Process Mimicry
Cross-Domain Model Creation