ACE-Merging: Data-Free Model Merging with Adaptive Covariance Estimation

📅 2026-03-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses performance interference in merging multi-task expert models under the challenging constraints of no data, no retraining, and no architectural modifications—conditions where conflicting task objectives typically degrade performance. To this end, the authors propose ACE-Merging, an adaptive covariance estimation framework that leverages theoretical insights revealing how parameter differences between fine-tuned models implicitly encode task input covariances. Building on this observation, ACE-Merging derives a closed-form solution for efficient, data-free model fusion. As the first method to implicitly estimate task covariances without any access to data, it departs from existing heuristic or iterative approaches by offering a principled, non-iterative merging strategy. Empirical results demonstrate state-of-the-art performance in fully data-free settings across vision and language benchmarks, including an average absolute improvement of 4% across seven tasks on GPT-2, achieving both high accuracy and low computational overhead.

📝 Abstract
Model merging aims to combine multiple task-specific expert models into a single model while preserving generalization across diverse tasks. However, interference among experts, especially when they are trained on different objectives, often leads to significant performance degradation. Despite recent progress, resolving this interference without data access, retraining, or architectural modification remains a fundamental challenge. This paper provides a theoretical analysis demonstrating that the input covariance of each task, which is a key factor for optimal merging, can be implicitly estimated from the parameter differences of its fine-tuned model, even in a fully data-free setting. Building on this insight, we introduce ACE-Merging, an Adaptive Covariance Estimation framework that effectively mitigates inter-task interference. Our approach features a principled, closed-form solution that contrasts with prior iterative or heuristic methods. Extensive experiments on both vision and language benchmarks demonstrate that ACE-Merging sets a new state-of-the-art among data-free methods. It consistently outperforms existing baselines; for example, ACE-Merging achieves an average absolute improvement of 4% over previous methods across seven tasks on GPT-2. Owing to its efficient closed-form formulation, ACE-Merging delivers superior performance with a modest computational cost, providing a practical and theoretically grounded solution for model merging.
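To make the general idea concrete, the following is a minimal sketch of covariance-weighted, closed-form merging in the spirit the abstract describes. The covariance estimator here (a regularized Gram matrix of each expert's parameter difference from the pretrained weights) is a hypothetical placeholder, not the paper's derivation; the function names `estimate_covariance` and `covariance_weighted_merge` are illustrative, not from the paper.

```python
import numpy as np

def estimate_covariance(delta_w, eps=1e-3):
    # Hypothetical proxy: treat the Gram matrix of the parameter
    # difference (fine-tuned minus pretrained) as a stand-in for the
    # task's input covariance. ACE-Merging's actual estimator is
    # derived theoretically; this is only an illustration of the
    # "covariance from parameter differences" idea.
    d = delta_w.shape[0]
    return delta_w @ delta_w.T + eps * np.eye(d)

def covariance_weighted_merge(w_pre, finetuned):
    # Closed-form, data-free merge: each expert's weight update is
    # weighted by its estimated covariance, then the weighted sum is
    # normalized by the total covariance (a single linear solve, no
    # iterative optimization).
    deltas = [w - w_pre for w in finetuned]
    covs = [estimate_covariance(d) for d in deltas]
    total = sum(covs)
    weighted = sum(c @ d for c, d in zip(covs, deltas))
    return w_pre + np.linalg.solve(total, weighted)

# Toy usage: merge two 4x3 expert weight matrices.
rng = np.random.default_rng(0)
w0 = rng.normal(size=(4, 3))
experts = [w0 + 0.1 * rng.normal(size=(4, 3)) for _ in range(2)]
merged = covariance_weighted_merge(w0, experts)
print(merged.shape)  # (4, 3)
```

Because the solution is closed-form, the cost is one matrix solve per layer rather than a training loop, which is consistent with the "modest computational cost" claim above.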
Problem

Research questions and friction points this paper is trying to address.

model merging
data-free
task interference
covariance estimation
expert models
Innovation

Methods, ideas, or system contributions that make the work stand out.

data-free model merging
adaptive covariance estimation
closed-form solution
inter-task interference
parameter difference analysis