AI Summary
Medical imaging often suffers from limited annotated data, constraining deep learning performance; existing transfer learning approaches typically rely on models with identical initialization, limiting effective integration of complementary features from heterogeneous pre-trained models. This paper proposes a kernel-level weight adaptive fusion method that operates across different initializations and tasks: it dynamically aggregates parameters from multiple pre-trained sources via a learnable kernel-wise weighting mechanism, coupled with feature distillation and transfer-aware optimization. To our knowledge, this is the first end-to-end weight fusion framework for medical imaging that seamlessly integrates models trained from disparate initializations and on divergent tasks, breaking the conventional paradigm of same-initialization fusion. Evaluated on multiple downstream medical imaging tasks, the method achieves up to a 7% improvement in F1 score over single-model transfer baselines and significantly outperforms naive averaging and linear interpolation fusion strategies.
Abstract
Transfer learning has become a powerful tool to initialize deep learning models to achieve faster convergence and higher performance. This is especially useful in the medical imaging analysis domain, where data scarcity limits possible performance gains for deep learning models. Some advancements have been made in boosting the transfer learning performance gain by merging models starting from the same initialization. However, in the medical imaging analysis domain, there is an opportunity to merge models starting from different initializations, thus combining the features learned from different tasks. In this work, we propose MedMerge, a method whereby the weights of different models can be merged, and their features can be effectively utilized to boost performance on a new task. With MedMerge, we learn kernel-level weights that can later be used to merge the models into a single model, even when starting from different initializations. Testing on various medical imaging analysis tasks, we show that our merged model can achieve significant performance gains, with up to 7% improvement on the F1 score. The code implementation of this work is available at github.com/BioMedIA-MBZUAI/MedMerge.
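The abstract does not spell out how the kernel-level weights enter the merge, so the following is only a minimal sketch under stated assumptions, not the MedMerge implementation. It assumes two models with identical architecture, one scalar per convolution kernel (one per output/input channel pair), and a convex combination whose mixing coefficient is the sigmoid of a learnable logit. The names `merge_conv_weights` and `logits` are hypothetical, and the logits are set by hand here rather than trained.

```python
import math


def sigmoid(x):
    """Map a real-valued logit to a mixing coefficient in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def merge_conv_weights(w_a, w_b, logits):
    """Merge two conv weight tensors stored as nested lists [out][in][kh][kw].

    Assumption: logits[o][i] is a learnable scalar for kernel (o, i), and
    alpha = sigmoid(logit) is the share of that kernel taken from model A.
    """
    merged = []
    for o, (filt_a, filt_b) in enumerate(zip(w_a, w_b)):
        merged_filter = []
        for i, (k_a, k_b) in enumerate(zip(filt_a, filt_b)):
            alpha = sigmoid(logits[o][i])
            # Convex combination applied to every element of this one kernel.
            merged_filter.append([
                [alpha * a + (1.0 - alpha) * b for a, b in zip(row_a, row_b)]
                for row_a, row_b in zip(k_a, k_b)
            ])
        merged.append(merged_filter)
    return merged


# Toy layer: 1 output channel, 2 input channels, 1x2 kernels.
w_a = [[[[1.0, 1.0]], [[2.0, 2.0]]]]
w_b = [[[[3.0, 3.0]], [[4.0, 4.0]]]]
# Kernel (0, 0) mixes both models equally; kernel (0, 1) leans almost fully on model A.
logits = [[0.0, 50.0]]
merged = merge_conv_weights(w_a, w_b, logits)
```

In this toy case the first kernel is the plain average of the two models, while the second is taken almost entirely from model A. That per-kernel freedom is what separates this scheme from uniform averaging or a single global interpolation coefficient. In a training setting the logits would be optimized on the downstream task with the source weights held fixed, which is again an assumption of this sketch.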