AI Summary
This work addresses unsupervised domain adaptation under covariate shift by proposing a pseudo-labeling framework based on kernelized generalized linear models (GLMs). The method generates high-quality pseudo-labels for the target domain through an imputation model constructed from source-domain candidate models trained in batches. Robust model selection is achieved via a two-stage data partitioning strategy combined with ridge-regularized kernelized linear, logistic, and Poisson regressions. Theoretically, the authors establish non-asymptotic excess risk bounds and introduce the notion of "effective labeled sample size" to explicitly quantify the impact of covariate shift on adaptation performance. Experimental results demonstrate that the proposed approach significantly outperforms source-only baselines on both synthetic and real-world datasets.
Abstract
We propose a principled framework for unsupervised domain adaptation under covariate shift in kernel Generalized Linear Models (GLMs), encompassing kernelized linear, logistic, and Poisson regression with ridge regularization. Our goal is to minimize prediction error in the target domain by leveraging labeled source data and unlabeled target data, despite differences in covariate distributions. We partition the labeled source data into two batches: one for training a family of candidate models, and the other for building an imputation model. This imputation model generates pseudo-labels for the target data, enabling robust model selection. We establish non-asymptotic excess-risk bounds that characterize adaptation performance through an "effective labeled sample size", explicitly accounting for the unknown covariate shift. Experiments on synthetic and real datasets demonstrate consistent performance gains over source-only baselines.
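The two-batch procedure described above can be sketched for the squared-loss (kernel ridge regression) case. This is a minimal illustration, not the paper's exact method: the RBF kernel, the ridge-penalty grid, and the synthetic covariate shift (source and target covariates drawn from shifted Gaussians under the same regression function) are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise RBF (Gaussian) kernel matrix between rows of A and rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def fit_krr(X, y, lam, gamma=1.0):
    # Kernel ridge regression: solve (K + lam * n * I) alpha = y,
    # then predict on new points Z via K(Z, X) @ alpha.
    n = len(X)
    alpha = np.linalg.solve(rbf_kernel(X, X, gamma) + lam * n * np.eye(n), y)
    return lambda Z: rbf_kernel(Z, X, gamma) @ alpha

# Synthetic covariate shift (illustrative): source covariates ~ N(0, 1),
# target covariates ~ N(1, 1), identical regression function f on both domains.
f = lambda x: np.sin(2 * x[:, 0])
Xs = rng.normal(0.0, 1.0, size=(200, 1))
ys = f(Xs) + 0.1 * rng.normal(size=200)     # labeled source data
Xt = rng.normal(1.0, 1.0, size=(100, 1))    # unlabeled target covariates

# Two-stage partition of the labeled source data.
X1, y1 = Xs[:100], ys[:100]   # batch 1: train the family of candidate models
X2, y2 = Xs[100:], ys[100:]   # batch 2: build the imputation model

# Candidate models over an (assumed) grid of ridge penalties.
lambdas = [1e-3, 1e-2, 1e-1, 1.0]
candidates = [fit_krr(X1, y1, lam) for lam in lambdas]

# The imputation model, fit on batch 2, generates pseudo-labels on the target.
imputer = fit_krr(X2, y2, lam=1e-2)
pseudo_y = imputer(Xt)

# Model selection: keep the candidate with the smallest empirical risk
# on the target covariates, measured against the pseudo-labels.
risks = [np.mean((g(Xt) - pseudo_y) ** 2) for g in candidates]
best = candidates[int(np.argmin(risks))]
```

The key point the sketch illustrates is that model selection happens where the model will be deployed: the candidates compete on target covariates, with pseudo-labels standing in for the unavailable target labels.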