🤖 AI Summary
This paper addresses the degradation of downstream classification performance in self-supervised transfer learning caused by representation distribution shift. We propose Distribution Matching (DM), a method that explicitly aligns source-domain representation distributions with a predefined reference distribution (e.g., a Gaussian mixture) while jointly enforcing augmentation invariance, enabling fully unsupervised transfer. DM is the first method to incorporate explicit distribution alignment into self-supervised transfer learning, and it comes with theoretical guarantees: (i) a population-level theorem linking the self-supervised objective to downstream classification accuracy, and (ii) an end-to-end sample complexity bound for few-shot settings. Built on contrastive learning, DM integrates reference-distribution modeling with a theoretically grounded loss design. Extensive experiments on multiple real-world datasets demonstrate that DM achieves state-of-the-art classification performance under few-shot target-domain evaluation.
📝 Abstract
In this paper, we propose a novel self-supervised transfer learning method called Distribution Matching (DM), which drives the representation distribution toward a predefined reference distribution while preserving augmentation invariance. This design yields an intuitively structured representation space and offers easily interpretable hyperparameters. Experimental results across multiple real-world datasets and evaluation metrics demonstrate that DM performs competitively on target classification tasks compared with existing self-supervised transfer learning methods. Additionally, we provide robust theoretical guarantees for DM, including a population theorem and an end-to-end sample theorem. The population theorem bridges the gap between the self-supervised learning task and target classification accuracy, while the sample theorem shows that, even with a limited number of samples from the target domain, DM can deliver exceptional classification performance, provided the unlabeled sample size is sufficiently large.
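To make the described objective concrete, here is a minimal, hypothetical PyTorch sketch of a DM-style loss: an augmentation-invariance term plus a term that matches the batch of representations to samples drawn from a predefined reference distribution (a two-component Gaussian mixture in this example, with MMD as the alignment measure). The paper does not specify this exact formulation; the function names (`dm_style_loss`, `rbf_mmd2`) and the weight `lam` are illustrative assumptions, not the authors' API.

```python
import torch
import torch.nn.functional as F

def rbf_mmd2(x, y, sigma=1.0):
    """Biased estimate of squared MMD between sample sets x and y
    using an RBF kernel with bandwidth sigma."""
    def kernel(a, b):
        return torch.exp(-torch.cdist(a, b).pow(2) / (2 * sigma ** 2))
    return kernel(x, x).mean() + kernel(y, y).mean() - 2 * kernel(x, y).mean()

def dm_style_loss(z1, z2, ref, lam=1.0):
    """Hypothetical DM-style objective (not the paper's exact loss).

    z1, z2: representations of two augmented views of the same batch.
    ref:    samples drawn from the predefined reference distribution.
    lam:    weight trading off invariance against distribution matching.
    """
    # Augmentation invariance: pull the two views of each image together.
    invariance = (1.0 - F.cosine_similarity(z1, z2, dim=-1)).mean()
    # Distribution matching: align the pooled batch of representations
    # with the reference samples (here via MMD, as an assumed choice).
    matching = rbf_mmd2(torch.cat([z1, z2], dim=0), ref)
    return invariance + lam * matching

# Illustrative usage with a two-component Gaussian-mixture reference.
batch, dim = 64, 128
z1, z2 = torch.randn(batch, dim), torch.randn(batch, dim)
means = torch.stack([-torch.ones(dim), torch.ones(dim)])
ref = means[torch.randint(0, 2, (2 * batch,))] + 0.1 * torch.randn(2 * batch, dim)
loss = dm_style_loss(z1, z2, ref)
```

MMD is used here only because it is a standard differentiable way to compare a batch of samples against a reference distribution; the paper's actual alignment term and contrastive component may differ.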