🤖 AI Summary
Existing self-supervised learning (SSL) methods rely on explicitly constructed positive sample pairs to enforce similarity, limiting their ability to model continuous similarity relationships among samples and lacking uncertainty quantification—resulting in poor robustness to out-of-distribution (OOD) inputs. To address these limitations, we propose Gaussian Process Self-Supervised Learning (GPSSL), the first SSL framework to incorporate a Gaussian process (GP) prior over the representation space. GPSSL implicitly enforces smoothness and similarity without manual positive-pair design, encoding structural priors via kernel functions and performing Bayesian representation learning through generalized posterior optimization. This formulation naturally enables uncertainty propagation and calibration. Experiments across diverse classification and regression benchmarks demonstrate that GPSSL significantly improves predictive accuracy while achieving superior error control and OOD detection performance compared to state-of-the-art SSL approaches.
📝 Abstract
Self-supervised learning (SSL) is a machine learning paradigm in which models learn the underlying structure of data without explicit supervision from labeled samples. Representations acquired through SSL have proven useful for many downstream tasks, including clustering and linear classification. To ensure smoothness of the representation space, most SSL methods rely on the ability to generate pairs of observations that are similar to a given instance. However, generating these pairs may be challenging for many types of data. Moreover, these methods lack uncertainty quantification and can perform poorly in out-of-sample prediction settings. To address these limitations, we propose Gaussian process self-supervised learning (GPSSL), a novel approach that applies Gaussian process (GP) models to representation learning. GP priors are imposed on the representations, and we obtain a generalized Bayesian posterior by minimizing a loss function that encourages informative representations. The covariance function inherent in GPs naturally pulls representations of similar units together, serving as an alternative to explicitly defined positive samples. We show that GPSSL is closely related to both kernel PCA and VICReg, a popular neural network-based SSL method, but unlike both it yields posterior uncertainties that can be propagated to downstream tasks. Experiments on various datasets, covering classification and regression tasks, demonstrate that GPSSL outperforms traditional methods in terms of accuracy, uncertainty quantification, and error control.
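The abstract's key idea, that a GP covariance function pulls representations of similar inputs together without hand-crafted positive pairs, can be illustrated with a toy sketch. This is not the authors' implementation; the kernel choice (squared-exponential), lengthscale, and toy inputs are illustrative assumptions:

```python
import numpy as np

def rbf_kernel(X, lengthscale=1.0):
    # Squared-exponential kernel: K[i, j] = exp(-||x_i - x_j||^2 / (2 * l^2)).
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * lengthscale**2))

rng = np.random.default_rng(0)
# Toy inputs: the first two are near-duplicates, the third is far away.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
K = rbf_kernel(X)

# Draw latent representations under the GP prior: each of d latent
# dimensions is an independent GP sample, giving Z of shape (n, d).
d = 4
L = np.linalg.cholesky(K + 1e-8 * np.eye(len(X)))
Z = L @ rng.standard_normal((len(X), d))

# A priori, representations of similar inputs are highly correlated
# (K[0, 1] is near 1) while dissimilar inputs are nearly independent
# (K[0, 2] is near 0) -- no explicit positive pairs required.
print(K[0, 1], K[0, 2])
```

Under this prior, rows 0 and 1 of `Z` are strongly correlated in every latent dimension, which is the implicit similarity-enforcing mechanism the abstract contrasts with explicit positive-pair construction.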