Evolutionary Optimization of Model Merging Recipes

📅 2024-03-19

🏛️ arXiv.org

📈 Citations: 61

✨ Influential: 6

🤖 AI Summary

Problem: Existing large language model (LLM) merging techniques suffer from unstable fusion performance, heavy reliance on manual expertise, and high computational overhead. Method: This paper proposes an automated model merging framework based on Differential Evolution (DE), described as the first to systematically integrate multi-objective evolutionary algorithms into the search for model merging recipes. It jointly optimizes the key merging decisions (weight allocation, layer selection, and normalization schemes) to improve performance, efficiency, and generalization at the same time. Using a parameterized merging template and a zero-shot task validation mechanism, the framework needs no additional fine-tuning or labeled data. Contribution/Results: Evaluated across 12 downstream tasks, the method achieves an average accuracy improvement of 2.7%, significantly outperforming handcrafted recipes and simple averaging baselines. Its search cost is also 90% lower than that of exhaustive grid search.
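To make the idea of searching for a merging recipe concrete, here is a minimal sketch of a Differential Evolution loop over per-layer merge weights. It is an illustration only, not the paper's implementation: the "models" are random NumPy arrays, the fitness is a toy stand-in for a validation score, the search is single-objective and covers weight allocation only (no layer selection or normalization choices), and every name (`merge`, `differential_evolution`, `fitness`, `F`, `CR`) is an assumption introduced for this example.

```python
import numpy as np

def merge(models, coeffs):
    """Per-layer weighted average of several models.

    models: array of shape (n_models, n_layers, dim)
    coeffs: positive array of shape (n_models, n_layers)
    """
    w = coeffs / coeffs.sum(axis=0, keepdims=True)  # weights sum to 1 per layer
    return np.einsum("ml,mld->ld", w, models)

def differential_evolution(fitness, dim, pop_size=20, generations=60,
                           F=0.6, CR=0.9, seed=0):
    """Classic DE/rand/1/bin, maximizing `fitness` over the box [0.05, 1]^dim."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(0.05, 1.0, size=(pop_size, dim))
    scores = np.array([fitness(x) for x in pop])
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than i.
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, size=3, replace=False)]
            mutant = np.clip(a + F * (b - c), 0.05, 1.0)
            # Binomial crossover; force at least one gene from the mutant.
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])
            s = fitness(trial)
            if s >= scores[i]:  # greedy one-to-one selection
                pop[i], scores[i] = trial, s
    best = int(np.argmax(scores))
    return pop[best], float(scores[best])

# Toy setup (hypothetical): two source "models" with three layers each.
rng = np.random.default_rng(1)
n_models, n_layers, d = 2, 3, 8
models = rng.normal(size=(n_models, n_layers, d))
true_w = np.array([[0.8, 0.5, 0.2],
                   [0.2, 0.5, 0.8]])
target = merge(models, true_w)  # stands in for "what the validation task rewards"

def fitness(x):
    """Toy validation score: negative squared error to the target merge."""
    merged = merge(models, x.reshape(n_models, n_layers))
    return -float(np.mean((merged - target) ** 2))

uniform_score = fitness(np.ones(n_models * n_layers))  # simple averaging baseline
best_x, best_score = differential_evolution(fitness, dim=n_models * n_layers)
print("uniform averaging score:", uniform_score)
print("evolved recipe score:   ", best_score)
```

In a real setting, `fitness` would merge actual checkpoints and score the result on held-out tasks, which is far more expensive per evaluation. That cost is why the population size and number of generations matter in practice.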