AI Summary
This work proposes the DMM framework to address scenarios where data privacy or heterogeneity precludes centralized training, enabling effective fusion of highly heterogeneous domain models without accessing raw data. The approach follows a three-stage pipeline: first, individual domain models are trained independently; second, similar models are clustered and merged; third, normalized statistics are leveraged to synthesize pseudo-data for lightweight knowledge distillation of the aggregated model. DMM is the first method to achieve stable model fusion under strict no-data conditions while preserving rare yet critical knowledge through pseudo-data-guided distillation. Experiments demonstrate that DMM consistently outperforms existing model fusion techniques on both single-modal and multimodal benchmarks.
Abstract
Learning across domains is challenging when data cannot be centralized due to privacy or heterogeneity, which limits the ability to train a single comprehensive model. Model merging provides an appealing alternative by consolidating knowledge from multiple specialized models into one, avoiding data sharing and reducing retraining cost. In this work, we present DMM, a data-free model merging framework designed to handle highly divergent models. DMM proceeds in three steps. First, domain-specific models are trained independently. Second, models with high similarity are merged using standard techniques to ensure stability. Third, we synthesize pseudo-data from normalization statistics and distill knowledge from divergent models into the merged model through a lightweight refinement guided by these samples. This approach preserves rare but critical knowledge while maintaining stability. Extensive experiments on unimodal and multimodal benchmarks show that DMM consistently outperforms existing merging methods, achieving state-of-the-art results.
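The three-stage pipeline can be sketched with toy linear "models". Everything below (the dictionary layout, the cosine-similarity threshold, the `merge_similar` and `distill` helpers) is a hypothetical illustration of the idea under simplifying assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stage 1: independently trained domain "models" -- toy weight vectors here,
# each paired with the (mean, std) normalization statistics recorded during
# its own training. Layout and names are illustrative assumptions.
models = {
    "domain_a": {"w": np.array([1.0, 0.9, 1.1]), "stats": (0.0, 1.0)},
    "domain_b": {"w": np.array([1.1, 1.0, 0.9]), "stats": (0.1, 1.1)},
    "domain_c": {"w": np.array([-2.0, 3.0, 0.5]), "stats": (5.0, 2.0)},  # divergent
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def merge_similar(models, tau=0.8):
    """Stage 2: average the weights of models similar to an anchor model;
    models below the similarity threshold are held out as divergent."""
    names = list(models)
    anchor_w = models[names[0]]["w"]
    similar = [n for n in names if cosine(anchor_w, models[n]["w"]) > tau]
    divergent = [n for n in names if n not in similar]
    merged_w = np.mean([models[n]["w"] for n in similar], axis=0)
    return merged_w, divergent

def distill(merged_w, teacher, n=256, steps=50, lr=0.005):
    """Stage 3: sample pseudo-inputs from the held-out model's stored
    normalization statistics, then run a lightweight L2 distillation of
    its outputs on those samples into the merged weights. (A real system
    would also regularize toward the merged weights to keep stability.)"""
    mu, sigma = teacher["stats"]
    x = rng.normal(mu, sigma, size=(n, merged_w.size))  # synthesized pseudo-data
    w = merged_w.copy()
    for _ in range(steps):
        # gradient of the mean squared distillation loss ||x w - x w_t||^2
        grad = 2.0 * x.T @ (x @ (w - teacher["w"])) / n
        w -= lr * grad
    return w

merged, divergent = merge_similar(models)   # domain_c is held out as divergent
for name in divergent:
    merged = distill(merged, models[name])  # pull its knowledge into the merge
```

After the distillation loop, the merged weights have moved toward the divergent model's behavior on the pseudo-inputs without any real domain data being shared, which is the core of the data-free claim.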