🤖 AI Summary
This work proposes a multi-source transfer learning method for regression when target-domain data are scarce, without assuming covariate shift or label shift. The approach fits a separate conditional generative model for each heterogeneous source domain and calibrates the generated responses to the target domain through conditional quantile matching, yielding augmented data for training in the target domain. Theoretical analysis establishes convergence rates for the quantile-matching estimator and shows that, under mild conditions, the empirical risk minimizer trained on the augmented data achieves a tighter excess risk bound than the target-only one. Experiments on synthetic and real-world datasets show that the method consistently improves prediction accuracy over target-only baselines and competing transfer learning approaches.
📝 Abstract
We introduce a transfer learning framework for regression that leverages heterogeneous source domains to improve predictive performance in a data-scarce target domain. Our approach learns a conditional generative model separately for each source domain and calibrates the generated responses to the target domain via conditional quantile matching. This distributional alignment step corrects general discrepancies between source and target domains without imposing restrictive assumptions such as covariate or label shift. The resulting framework provides a principled and flexible approach to high-quality data augmentation for downstream learning tasks in the target domain. From a theoretical perspective, we show that an empirical risk minimizer (ERM) trained on the augmented dataset achieves a tighter excess risk bound than the target-only ERM under mild conditions. In particular, we establish new convergence rates for the quantile matching estimator that governs the transfer bias-variance tradeoff. From a practical perspective, extensive simulations and real data applications demonstrate that the proposed method consistently improves prediction accuracy over target-only learning and competing transfer learning methods.
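The calibration step can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that do not come from the abstract: a single scalar covariate, conditional distributions approximated by binning the covariate, empirical CDFs within each bin, and the observed source responses standing in for draws from a fitted conditional generative model. The function name `quantile_match` and all parameters are hypothetical.

```python
import numpy as np

def quantile_match(x_src, y_src, x_tgt, y_tgt, n_bins=4):
    """Map source responses onto the target conditional distribution.

    Hypothetical sketch: conditioning on x is approximated by binning,
    and each source response is sent to the target quantile at the same
    within-bin rank, i.e. y -> Q_T(F_S(y | bin) | bin).
    """
    # Bin edges taken from quantiles of the pooled covariates.
    pooled = np.concatenate([x_src, x_tgt])
    edges = np.quantile(pooled, np.linspace(0.0, 1.0, n_bins + 1))
    # Interior edges give bin indices 0 .. n_bins - 1.
    bin_src = np.digitize(x_src, edges[1:-1])
    bin_tgt = np.digitize(x_tgt, edges[1:-1])

    y_cal = y_src.astype(float).copy()
    for b in range(n_bins):
        s = bin_src == b
        t = bin_tgt == b
        if s.sum() == 0 or t.sum() < 2:
            continue  # too little data in this bin: leave responses as-is
        # Within-bin rank of each source response, rescaled to (0, 1).
        ranks = np.argsort(np.argsort(y_src[s]))
        u = (ranks + 0.5) / s.sum()
        # Replace each response by the target quantile at the same level.
        y_cal[s] = np.quantile(y_tgt[t], u)
    return y_cal

# Toy data (assumed): the source and target regression functions differ.
rng = np.random.default_rng(0)
x_src = rng.uniform(0.0, 1.0, 2000)
y_src = 2.0 * x_src + 3.0 + rng.normal(0.0, 0.5, 2000)
x_tgt = rng.uniform(0.0, 1.0, 40)
y_tgt = x_tgt + rng.normal(0.0, 0.5, 40)

y_cal = quantile_match(x_src, y_src, x_tgt, y_tgt)

# Least-squares ERM on the augmented sample (target plus calibrated source).
x_aug = np.concatenate([x_tgt, x_src])
y_aug = np.concatenate([y_tgt, y_cal])
design = np.column_stack([np.ones_like(x_aug), x_aug])
coef = np.linalg.lstsq(design, y_aug, rcond=None)[0]
print("intercept, slope on augmented data:", coef)
```

Coarse bins keep the within-bin target quantiles stable when the target sample is small, at the cost of blurring the dependence on the covariate inside each bin. That is a rough analogue of the transfer bias-variance tradeoff the abstract mentions, not a statement of the paper's estimator.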