🤖 AI Summary
This paper investigates how to leverage heterogeneous multi-source data to improve generalization in high-dimensional linear regression, focusing on the interplay between transfer learning and the minimum ℓ₂-norm interpolator (MNI). We propose a two-step Transfer MNI method and derive a non-asymptotic upper bound on its excess risk. Our analysis reveals, for the first time, a "free-lunch covariate shift" phenomenon: benign overfitting can exploit source–target covariate distribution mismatches to enhance performance. Furthermore, we develop a data-driven framework for source selection and ensemble Transfer MNI, achieving significant gains in robustness and generalization at low computational cost. Theoretically, under moderate source–target task correlation and favorable signal-to-noise ratios, our method strictly outperforms the target-only MNI. Empirical results demonstrate strong adaptability to both model and data heterogeneity.
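For readers unfamiliar with the central object: in the overparameterized regime (p > n), infinitely many coefficient vectors interpolate the training data, and the MNI is the one of minimum ℓ₂ norm, computable via the Moore–Penrose pseudoinverse. A minimal numpy sketch (all variable names and dimensions here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 100  # overparameterized: more features than samples
X = rng.standard_normal((n, p))
beta_true = rng.standard_normal(p) / np.sqrt(p)
y = X @ beta_true + 0.1 * rng.standard_normal(n)

# Minimum l2-norm interpolator: beta_mni = X^T (X X^T)^{-1} y = pinv(X) @ y
beta_mni = np.linalg.pinv(X) @ y

# It fits the training data exactly (benign overfitting studies when
# this exact fit still generalizes well)
print(np.allclose(X @ beta_mni, y))
```

Any other interpolator differs from `beta_mni` by a vector in the null space of `X`, and is therefore longer in ℓ₂ norm, since `beta_mni` lies in the row space of `X`.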
📝 Abstract
Transfer learning is a key component of modern machine learning, enhancing the performance of target tasks by leveraging diverse data sources. Simultaneously, overparameterized models such as the minimum-$\ell_2$-norm interpolator (MNI) in high-dimensional linear regression have garnered significant attention for their remarkable generalization capabilities, a property known as benign overfitting. Despite their individual importance, the intersection of transfer learning and MNI remains largely unexplored. Our research bridges this gap by proposing a novel two-step Transfer MNI approach and analyzing its trade-offs. We characterize its non-asymptotic excess risk and identify conditions under which it outperforms the target-only MNI. Our analysis reveals free-lunch covariate shift regimes, where leveraging heterogeneous data yields the benefit of knowledge transfer at limited cost. To operationalize our findings, we develop a data-driven procedure to detect informative sources and introduce an ensemble method incorporating multiple informative Transfer MNIs. Finite-sample experiments demonstrate the robustness of our methods to model and data heterogeneity, confirming their advantage.
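The abstract does not spell out the two steps. One common construction in the two-step transfer-learning literature is to first fit the source data, then correct the fit on the target residuals; the sketch below follows that pattern with the MNI as the estimator at each step. This is an illustrative assumption about the procedure, not necessarily the paper's exact algorithm, and all sizes and names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
p = 200
n_src, n_tgt = 80, 20  # source is larger than target, both < p

beta_tgt = rng.standard_normal(p) / np.sqrt(p)
beta_src = beta_tgt + 0.05 * rng.standard_normal(p)  # correlated source task

X_src = rng.standard_normal((n_src, p))
y_src = X_src @ beta_src + 0.1 * rng.standard_normal(n_src)
X_tgt = rng.standard_normal((n_tgt, p))
y_tgt = X_tgt @ beta_tgt + 0.1 * rng.standard_normal(n_tgt)

# Step 1: MNI fit on the source data
w_src = np.linalg.pinv(X_src) @ y_src

# Step 2: MNI fit of the target residuals, correcting the source fit
delta = np.linalg.pinv(X_tgt) @ (y_tgt - X_tgt @ w_src)
beta_transfer = w_src + delta

# Target-only baseline for comparison
beta_only = np.linalg.pinv(X_tgt) @ y_tgt

# Parameter-estimation error as a proxy for excess risk (isotropic covariates)
err_transfer = np.linalg.norm(beta_transfer - beta_tgt)
err_only = np.linalg.norm(beta_only - beta_tgt)
print(err_transfer, err_only)
```

When the source task is close to the target (small `beta_src - beta_tgt`), the correction step only has to interpolate a small residual signal, which is the intuition behind the conditions under which Transfer MNI beats the target-only MNI.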