🤖 AI Summary
This work addresses source-free semi-supervised regression under domain shift, where source data are inaccessible, target labels are scarce, and the source and target distributions diverge, as exemplified by real-world medical tasks such as EEG-based gaze estimation and brain age prediction. We propose the first deep transfer learning framework tailored to source-free semi-supervised regression. Unlike existing approaches that rely on source data or explicit domain alignment, our method pioneers the adaptation of the Contradistinguisher framework to this setting. It leverages a pretrained model to establish a novel regularization mechanism, jointly optimizing a small labeled loss and an unsupervised consistency constraint for end-to-end cross-domain knowledge transfer. Extensive experiments on two realistic regression benchmarks demonstrate that our method reduces RMSE by up to 9% over fine-tuning baselines and consistently outperforms four state-of-the-art source-free methods by an average margin exceeding 3%.
📝 Abstract
Deep learning models deployed in real-world applications (e.g., medicine) face challenges because source models do not generalize well to domain-shifted target data. Many successful domain adaptation (DA) approaches require full access to source data. Yet, such requirements are unrealistic when source data cannot be shared, either because of privacy concerns or because the data are too large and incur prohibitive storage or computational costs. Moreover, resource constraints may limit the availability of labeled target data. We illustrate this challenge in a neuroscience setting where source data are unavailable, labeled target data are meager, and predictions involve continuous-valued outputs. We build upon Contradistinguisher (CUDA), an efficient framework that learns a shared model across labeled source and unlabeled target samples without intermediate representation alignment. However, CUDA was designed for unsupervised DA with full access to source data, and for classification tasks. We develop CRAFT -- a Contradistinguisher-based Regularization Approach for Flexible Training -- for source-free (SF), semi-supervised transfer of pretrained models in regression tasks. We showcase the efficacy of CRAFT in two neuroscience settings: gaze prediction with electroencephalography (EEG) data and ``brain age'' prediction with structural MRI data. On both datasets, CRAFT yielded up to a 9% improvement in root-mean-squared error (RMSE) over fine-tuned models when labeled training examples were scarce. Moreover, CRAFT leveraged unlabeled target data to outperform four competing state-of-the-art source-free domain adaptation models by more than 3%. Lastly, we demonstrate the efficacy of CRAFT on two other real-world regression benchmarks. We propose CRAFT as an efficient approach for source-free, semi-supervised deep transfer in regression tasks, which are ubiquitous in biology and medicine.
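The joint objective described above (a supervised loss on the few labeled target samples plus an unsupervised regularizer anchored by the pretrained source model) can be sketched as follows. This is a minimal illustration only, assuming a linear model and a simple prediction-consistency penalty; the function name `craft_style_loss` and the weight `lam` are hypothetical, and the actual CRAFT regularizer follows the Contradistinguisher formulation, which is not reproduced here.

```python
import numpy as np

def craft_style_loss(w, X_lab, y_lab, X_unlab, w_src, lam=0.5):
    """Sketch of a source-free, semi-supervised regression objective.

    w      : parameters of the adapting (target) linear model
    w_src  : frozen parameters of the pretrained source model
    X_lab, y_lab : the few labeled target samples
    X_unlab      : unlabeled target samples
    lam    : weight on the unsupervised regularizer (hypothetical)
    """
    # Supervised term: MSE on the scarce labeled target data.
    sup = np.mean((X_lab @ w - y_lab) ** 2)
    # Unsupervised term: keep the adapting model's predictions on
    # unlabeled target data consistent with the frozen source model
    # (a stand-in for the Contradistinguisher-based regularizer).
    cons = np.mean((X_unlab @ w - X_unlab @ w_src) ** 2)
    return sup + lam * cons
```

With `lam=0` the objective reduces to ordinary fine-tuning on the labeled targets; the regularizer is what lets the unlabeled target data constrain the solution when labels are scarce.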