🤖 AI Summary
To address the lack of fine-grained knowledge extraction, the inefficiency of aggregation, and the resulting suboptimal accuracy in multi-source transfer learning, this paper proposes a lightweight knowledge fusion framework based on Singular Value Decomposition (SVD). Methodologically, each source model is decomposed layer-wise into rank-one components; the most salient components across all sources are selected, and only the principal singular values of the fused matrix are fine-tuned for target-task adaptation. This design yields both accuracy and efficiency: it avoids full-parameter fine-tuning and thus drastically reduces retraining overhead, is robust to noise and pruning-induced perturbations in the sources, and scales effectively to large, high-parameter models. Experiments demonstrate substantial performance gains across diverse multi-source transfer tasks and strong computational scalability, establishing a new paradigm for efficient reuse of model knowledge.
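The layer-wise decomposition and salient-component selection can be pictured with a short sketch. The PyTorch code below is a minimal illustration, not the paper's exact procedure: the helper names (`decompose_sources`, `fuse_salient`), the salience rule (ranking rank-one components by singular value magnitude), and the assumption that each layer weight is a 2-D matrix are all illustrative choices.

```python
import torch

def decompose_sources(weights):
    """SVD-decompose each source's weight matrix for one layer
    into its rank-one components (u_i, s_i, v_i)."""
    components = []
    for W in weights:  # one (out_dim, in_dim) matrix per source model
        U, S, Vh = torch.linalg.svd(W, full_matrices=False)
        for i in range(S.shape[0]):
            components.append((U[:, i], S[i], Vh[i, :]))
    return components

def fuse_salient(components, k):
    """Keep the k most salient rank-one components across all sources
    (here, salience = singular value magnitude) and sum them into a
    single fused weight matrix."""
    top = sorted(components, key=lambda c: c[1].item(), reverse=True)[:k]
    return sum(s * torch.outer(u, v) for u, s, v in top)

# Toy usage: three 64x32 "source" layers fused into one matrix of rank <= 16.
sources = [torch.randn(64, 32) for _ in range(3)]
W_fused = fuse_salient(decompose_sources(sources), k=16)
print(W_fused.shape)  # torch.Size([64, 32])
```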
📝 Abstract
While transfer learning is an advantageous strategy, its standard single-source form overlooks the opportunity to leverage knowledge from the numerous models available online. Addressing this multi-source transfer learning problem is a promising path to boost adaptability and cut re-training costs. However, existing approaches are inherently coarse-grained: they lack the precision needed for granular knowledge extraction and the aggregation efficiency required to fuse knowledge from a large number of source models or from sources with high parameter counts. We address these limitations by leveraging Singular Value Decomposition (SVD) to first decompose each source model into its elementary rank-one components. A subsequent aggregation stage then selects only the most salient components from all sources, overcoming the previous efficiency and precision limitations. To best preserve and leverage the synthesized knowledge base, our method adapts to the target task by fine-tuning only the principal singular values of the merged matrix; in essence, this process only recalibrates the importance of the top SVD components. The proposed framework allows for efficient transfer learning, is robust to perturbations both at the input level and in the parameter space (e.g., noisy or pruned sources), and scales well computationally.
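The adaptation step, fine-tuning only the principal singular values of the merged matrix, can be sketched as a re-parameterized linear layer. The sketch below assumes the fused weight is kept in the form U diag(s) Vᵀ with the singular vectors frozen and only the top singular values trainable; the module name `SingularValueTunedLinear` and the chosen rank are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SingularValueTunedLinear(nn.Module):
    """Linear layer whose weight is re-parameterized as U diag(s) V^T.
    The singular vectors of the fused matrix are frozen buffers; only
    the principal singular values s are trained, which simply
    recalibrates the importance of the top SVD components."""
    def __init__(self, W_fused, rank):
        super().__init__()
        U, S, Vh = torch.linalg.svd(W_fused, full_matrices=False)
        self.register_buffer("U", U[:, :rank])    # frozen
        self.register_buffer("Vh", Vh[:rank, :])  # frozen
        self.s = nn.Parameter(S[:rank].clone())   # trainable

    def forward(self, x):
        W = self.U @ torch.diag(self.s) @ self.Vh
        return x @ W.T

# Only `rank` scalars per layer are updated during target-task adaptation.
layer = SingularValueTunedLinear(torch.randn(64, 32), rank=16)
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # 16
```

Freezing the singular vectors keeps the fused knowledge base intact while shrinking the trainable parameter count to a handful of scalars per layer, which is where the claimed reduction in retraining overhead comes from.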