AutoMerge: Search-Based Model Merging Framework for Effective Model Reuse

📅 2026-01-30

🤖 AI Summary

Existing model merging methods generalize poorly and perform inconsistently when applied across architectures and domains. To address this limitation, this work proposes AutoMerge, a search-based model merging framework. AutoMerge segments a model into heterogeneous blocks and searches the merging space for a suitable merging technique and hyperparameter configuration. The study presents the first systematic evaluation of model merging across diverse domains, covering large language models, image classification, and autonomous driving. The results show that AutoMerge improves both the stability and performance of merged models, addressing the limited generality and robustness of current approaches.

📝 Abstract

Software reuse has long been recognized as a critical and widely studied topic in software engineering, offering substantial benefits in reducing development costs, improving software quality, and enhancing operational efficiency. This paradigm extends into deep learning through model reuse. Recently, model merging has emerged in the domain of large language models (LLMs) as a training-free approach that takes multiple task-specific models with the same architecture as source models and merges them without retraining, enhancing model reuse within LLMs. However, no prior work has systematically investigated whether such an approach can be effectively applied to other deep learning models with different architectures across domains. To bridge this gap, we present the first systematic study that evaluates five model merging techniques on three distinct model architectures across three domains: LLMs, image classification, and autonomous driving. Our findings reveal that directly applying existing model merging techniques leads to highly inconsistent results and falls notably short of their success within LLMs. Moreover, a single model merging technique often fails to handle the heterogeneous structural properties within a model, limiting its applicability to different model architectures across domains. Furthermore, the effectiveness of model merging techniques is highly sensitive to hyperparameter configurations, thereby constraining their potential for broader adoption. Inspired by these insights, we propose AutoMerge, a novel search-based model merging framework that first segments complex models into multiple heterogeneous blocks and then systematically explores the merging space to identify the merging technique and its hyperparameter configuration.
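The two-step idea in the abstract (segment a model into blocks, then search over merging techniques and hyperparameters) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the search algorithm or the five techniques evaluated, so the sketch assumes plain random search, two toy techniques (linear interpolation and task arithmetic), and a single hyperparameter `alpha`. All names (`merge_blocks`, `random_search`, `score_fn`) are hypothetical, and each block is represented as a flat list of floats.

```python
import random

def linear_merge(base, sources, alpha):
    # Interpolate between two source blocks; the base block is unused here.
    first, second = sources
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(first, second)]

def task_arithmetic_merge(base, sources, alpha):
    # Add the scaled sum of task vectors (source minus base) onto the base block.
    return [b + alpha * sum(s[i] - b for s in sources) for i, b in enumerate(base)]

# Assumed candidate techniques; a real system would register more.
TECHNIQUES = {"linear": linear_merge, "task_arithmetic": task_arithmetic_merge}

def merge_blocks(base, sources, config):
    """Merge block by block; config maps block name -> (technique name, alpha)."""
    merged = {}
    for name, (technique, alpha) in config.items():
        block_sources = [model[name] for model in sources]
        merged[name] = TECHNIQUES[technique](base[name], block_sources, alpha)
    return merged

def random_search(base, sources, score_fn, n_trials=50, seed=0):
    """Sample a technique and alpha per block; keep the best-scoring config."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = {
            name: (rng.choice(sorted(TECHNIQUES)), rng.uniform(0.0, 1.0))
            for name in base
        }
        score = score_fn(merge_blocks(base, sources, config))
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

# Toy usage: two "models" with two blocks each, scored by closeness to a target.
base = {"encoder": [0.0, 0.0], "head": [0.0, 0.0]}
sources = [
    {"encoder": [1.0, 1.0], "head": [2.0, 0.0]},
    {"encoder": [3.0, 1.0], "head": [0.0, 2.0]},
]
target = {"encoder": [2.0, 1.0], "head": [2.0, 2.0]}

def score_fn(merged):
    # Higher is better: negative squared distance to the toy target.
    return -sum((m - t) ** 2 for k in merged for m, t in zip(merged[k], target[k]))

best_config, best_score = random_search(base, sources, score_fn)
print(best_config, best_score)
```

In this toy setup the two blocks favor different techniques (interpolation for `encoder`, task arithmetic for `head`), which mirrors the abstract's observation that a single technique often fails to handle heterogeneous structure within one model. In practice `score_fn` would be a validation metric on the target tasks rather than a distance to a known target.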

Problem

Research questions and friction points this paper is trying to address.

model merging

deep learning

heterogeneous architectures

hyperparameter sensitivity

model reuse

Innovation

Methods, ideas, or system contributions that make the work stand out.

model merging

search-based optimization

model reuse