🤖 AI Summary
This work studies transfer learning for linear regression with multiple pre-trained models, focusing on how to combine several source priors and how to mitigate the bias induced by over-parameterization. The authors formulate target learning as a regularized optimization problem that minimizes the squared error on the target data plus a penalty on the distance between the learned model and the pre-trained models. To counter the over-parameterization bias of the pre-trained models, they introduce a multiplicative debiasing correction factor. Theoretical analysis and empirical evaluations indicate that, when the pre-trained models are over-parameterized, using sufficiently many of them is important for beneficial transfer, and that the debiasing correction can reduce the bias so that more pre-trained models can be leveraged to learn the target predictor.
📝 Abstract
We study transfer learning for a linear regression task using several least-squares pretrained models that can be overparameterized.
We formulate the target learning task as an optimization problem that minimizes the squared error on the target dataset with a penalty on the distance of the learned model from the pretrained models. We derive an analytical formulation of the test error of the learned target model and provide corresponding empirical evaluations.
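As a rough illustration of this kind of objective, the sketch below solves a penalized least-squares problem in closed form. The specific form $\|y - X\theta\|_2^2 + \alpha \sum_j \|\theta - \theta_j\|_2^2$, the single shared weight `alpha`, the function name `fit_target_with_priors`, and the synthetic data are all assumptions made for illustration; the abstract does not state the exact penalty or its weighting.

```python
import numpy as np

def fit_target_with_priors(X, y, pretrained, alpha):
    """Minimize ||y - X theta||^2 + alpha * sum_j ||theta - theta_j||^2.

    Assumed objective (not taken from the paper). Setting the gradient to
    zero gives (X^T X + alpha*m*I) theta = X^T y + alpha * sum_j theta_j.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    P = np.asarray(pretrained, dtype=float)  # shape (m, d): one row per pretrained model
    m, d = P.shape
    A = X.T @ X + alpha * m * np.eye(d)
    b = X.T @ y + alpha * P.sum(axis=0)
    return np.linalg.solve(A, b)

# Hypothetical synthetic setup: fewer target samples than parameters.
rng = np.random.default_rng(0)
d, n, m = 20, 10, 5
alpha = 1.0
theta_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ theta_true + 0.1 * rng.normal(size=n)
pretrained = theta_true + 0.3 * rng.normal(size=(m, d))  # noisy stand-ins for source models

theta_hat = fit_target_with_priors(X, y, pretrained, alpha)
print("squared distance to true parameters:", np.sum((theta_hat - theta_true) ** 2))
```

Under this assumed form, a very large `alpha` pulls the solution toward the average of the pretrained models, while `alpha` near zero approaches ordinary least squares on the target data alone.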
Our results elucidate when using more pretrained models can improve transfer learning. Specifically, if the pretrained models are overparameterized, using sufficiently many of them is important for beneficial transfer learning. However, the learning may be compromised by the overparameterization bias of the pretrained models, i.e., the restriction of the minimum $\ell_2$-norm solution to the small subspace spanned by the training examples in the high-dimensional parameter space. We propose a simple debiasing via a multiplicative correction factor that reduces the overparameterization bias and makes it possible to leverage more pretrained models to learn a target predictor.
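The overparameterization bias described above can be seen in a small simulation. The sketch below makes several assumptions that are not from the abstract: isotropic Gaussian features, noiseless source data, all source tasks sharing the same true parameter vector, and a correction factor of `d / n`, which is one natural choice in this toy setting and not necessarily the factor the authors use.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, m = 200, 20, 400  # parameters, samples per source dataset, pretrained models
theta_true = rng.normal(size=d)

def min_norm_fit(X, y):
    # Minimum l2-norm least-squares solution; with n < d it lies in the
    # n-dimensional row space of X (the span of the training examples).
    return np.linalg.pinv(X) @ y

models = []
for _ in range(m):
    X = rng.normal(size=(n, d))           # assumed isotropic Gaussian features
    models.append(min_norm_fit(X, X @ theta_true))  # noiseless labels (assumption)
models = np.stack(models)                  # shape (m, d)

avg = models.mean(axis=0)
# Each model is a projection of theta_true onto a random n-dim subspace,
# so on average it is shrunk by roughly n/d in this toy setting.
shrink = (avg @ theta_true) / (theta_true @ theta_true)

debiased = (d / n) * avg                   # assumed multiplicative correction
err_raw = np.sum((avg - theta_true) ** 2)
err_debiased = np.sum((debiased - theta_true) ** 2)
print("empirical shrinkage:", shrink, "vs n/d:", n / d)
print("error of plain average:", err_raw)
print("error of debiased average:", err_debiased)
```

In this toy setting the multiplicative factor also amplifies the variance of the averaged models, so the correction pays off only when enough models are averaged, which is consistent with the abstract's emphasis on using sufficiently many pretrained models.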