EvoGM: Learning to Merge LLMs via Evolutionary Generative Optimization

Date: 2026-05-27
Citations: 0
Influential: 0
AI Summary
This work addresses a limitation of existing evolutionary approaches to merging large language models: they rely on handcrafted stochastic operators and struggle to explore the performance landscape of the merging-coefficient space effectively. The authors propose EvoGM, an evolutionary generative merging framework that introduces learnable generative modeling into evolutionary model merging for the first time. EvoGM uses a dual-generator architecture with cycle-consistent learning to adaptively sample and refine high-potential merging candidates. By constructing winner-loser pairs from historical search trajectories, it models the distribution of high-performing coefficients efficiently and removes the reliance on manual heuristics. Combined with a multi-round mechanism in which elite merged models seed the next round, EvoGM is reported to significantly outperform current methods across diverse benchmarks, on both seen and unseen tasks.
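The winner-loser pairing mentioned above can be illustrated with a minimal sketch. It assumes the search history is a list of (coefficients, score) records and that any two records with different scores form a pair; the name `build_pairs`, the `margin` parameter, and the toy history are hypothetical and are not taken from the paper or its code.

```python
from itertools import combinations
from typing import List, Tuple

Coeffs = Tuple[float, ...]

def build_pairs(
    history: List[Tuple[Coeffs, float]], margin: float = 0.0
) -> List[Tuple[Coeffs, Coeffs]]:
    """Return (winner, loser) coefficient pairs from evaluated candidates.

    Assumption: a pair is kept only if the score gap exceeds `margin`.
    """
    pairs = []
    for (c_a, s_a), (c_b, s_b) in combinations(history, 2):
        if abs(s_a - s_b) <= margin:
            continue  # scores too close to call a winner
        pairs.append((c_a, c_b) if s_a > s_b else (c_b, c_a))
    return pairs

# Toy history: three coefficient vectors with made-up benchmark scores.
history = [((0.2, 0.8), 0.61), ((0.5, 0.5), 0.70), ((0.9, 0.1), 0.55)]
pairs = build_pairs(history)
```

Under this pairing rule, n evaluated candidates yield up to n(n-1)/2 training pairs, which is one plausible reading of how reusing past evaluations improves data efficiency.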
Abstract
Evolutionary model merging provides a powerful framework for the automated, training-free composition of LLMs through parameter-space search. However, existing methods predominantly rely on stochastic, hand-crafted operators that overlook the underlying performance landscape of the coefficient space. We propose Evolutionary Generative Merging (EvoGM), a framework that transcends manual heuristics by employing learnable generative modeling to optimize merging coefficients. Specifically, EvoGM features a dual-generator architecture with cycle-consistent learning to adaptively sample and refine promising merging candidates. By constructing winner-loser pairs from historical search trajectories, our framework effectively captures high-performance parameter distributions and maximizes data efficiency. This generative process is seamlessly integrated into a multi-round evolutionary pipeline, where elite merged models iteratively serve as new expert foundations. Extensive experiments across diverse benchmarks demonstrate that EvoGM significantly outperforms state-of-the-art baselines, exhibiting robust performance on both seen and unseen tasks. Code and data are available at https://github.com/JiangTao97/evogm.
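As a rough illustration of the multi-round pipeline described in the abstract, the sketch below merges toy "models" (flat parameter lists) with a coefficient vector and lets the best candidates of each round become the experts for the next. Several pieces are assumptions made for illustration only: the linear weighted-sum merge rule, the toy fitness function, and above all the uniform random sampler, which merely stands in for EvoGM's learned generators. All function names are hypothetical.

```python
import random
from typing import Callable, List, Sequence

Model = List[float]  # toy stand-in for a model's parameters

def merge(experts: Sequence[Model], coeffs: Sequence[float]) -> Model:
    """Weighted sum of expert parameters (assumed linear merge rule)."""
    return [sum(c * p for c, p in zip(coeffs, params)) for params in zip(*experts)]

def sample_coeffs(k: int, rng: random.Random) -> List[float]:
    """Random coefficients summing to 1; a placeholder for a learned generator."""
    raw = [rng.random() + 1e-9 for _ in range(k)]
    total = sum(raw)
    return [r / total for r in raw]

def evolve(
    experts: Sequence[Model],
    fitness: Callable[[Model], float],
    rounds: int = 3,
    candidates: int = 16,
    elites: int = 2,
    seed: int = 0,
) -> List[Model]:
    """Each round, merge the current pool and keep the top `elites` as new experts."""
    rng = random.Random(seed)
    pool = [list(e) for e in experts]
    for _ in range(rounds):
        merged = [merge(pool, sample_coeffs(len(pool), rng)) for _ in range(candidates)]
        # Assumption: current experts compete too, so the best score never drops.
        pool = sorted(pool + merged, key=fitness, reverse=True)[:elites]
    return pool

# Toy task: fitness is the negative squared distance to a target vector.
target = [1.0, 1.0]
def toy_fitness(m: Model) -> float:
    return -sum((a - b) ** 2 for a, b in zip(m, target))

experts = [[2.0, 0.0], [0.0, 2.0]]
final_pool = evolve(experts, toy_fitness)
```

In a real setting the fitness call would be a benchmark evaluation of the merged LLM, which is why a sampler that proposes better coefficients per evaluation matters.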
Problem

Research questions and friction points this paper is trying to address.

model merging
large language models
evolutionary optimization
parameter-space search
coefficient space
Innovation

Methods, ideas, or system contributions that make the work stand out.

Evolutionary Generative Merging
learnable generative modeling
dual-generator architecture
cycle-consistent learning
parameter-space search
Authors
Tao Jiang
Research Institute of Trustworthy Autonomous Systems and Department of Computer Science and Engineering, Southern University of Science and Technology; Pengcheng Laboratory
Xinmeng Yu
Research Institute of Trustworthy Autonomous Systems and Department of Computer Science and Engineering, Southern University of Science and Technology; Pengcheng Laboratory
Chenhao Yi
University of Chinese Academy of Sciences; Pengcheng Laboratory
Yiling Wu
Pengcheng Laboratory
Yan Li
Pengcheng Laboratory
Ran Cheng
Department of Data Science and Artificial Intelligence, The Hong Kong Polytechnic University; Hong Kong Polytechnic University Shenzhen Research Institute; Hong Kong Polytechnic University-Daya Bay Technology and Innovation Research Institute
Dongmei Jiang
Northwestern Polytechnical University; Peng Cheng Laboratory
Affective Computing, Multimodal emotion recognition, Multimodal mental state evaluation
Jianguo Zhang
Professor, Southern University of Science and Technology
Object Recognition, Computer Vision, Image Processing, Visual Surveillance