🤖 AI Summary
Existing self-supervised learning (SSL) methods lack explicit modeling and theoretical characterization of representation “universality”, which encompasses discriminability, generalization, and transferability. This paper formally defines universality along these three dimensions and proposes GeSSL, an SSL framework that jointly targets learning universality (minimizing loss across all training samples) and evaluation universality (learning causally invariant representations that generalize to unseen tasks). Its core contributions are: (i) a σ-measurement that quantifies the gap between the performance of the SSL model and that of optimal task-specific models; and (ii) a framework that combines a task-level reinterpretation of SSL (each mini-batch is treated as a multi-class classification task), causally invariant representation learning, and the integration of task-specific models learned from multiple tasks. The authors provide theoretical support that GeSSL improves discriminability and cross-task generalization. Empirically, GeSSL is reported to outperform state-of-the-art SSL methods across multiple benchmarks, with the σ-measurement indicating that it lies closer to task-optimal performance.
📝 Abstract
In this paper, we investigate the characteristics that define a good representation or model. We propose that such a representation or model should possess universality, characterized by: (i) discriminability: performing well on training samples; (ii) generalization: performing well on unseen datasets; and (iii) transferability: performing well on unseen tasks with distribution shifts. Despite its importance, current self-supervised learning (SSL) methods lack explicit modeling of universality, and theoretical analysis remains underexplored. To address these issues, we aim to explore and incorporate universality into SSL. Specifically, we first revisit SSL from a task perspective and find that each mini-batch can be viewed as a multi-class classification task. We then propose that a universal SSL model should achieve: (i) learning universality by minimizing loss across all training samples, and (ii) evaluation universality by learning causally invariant representations that generalize well to unseen tasks. To quantify this, we introduce a $\sigma$-measurement that assesses the gap between the performance of the SSL model and that of optimal task-specific models. Furthermore, to model universality, we propose the GeSSL framework. It first learns task-specific models by minimizing the SSL loss, then incorporates future updates to enhance discriminability, and finally integrates these models to learn from multiple tasks. Theoretical and empirical evidence supports the effectiveness of GeSSL.
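
The task perspective mentioned in the abstract (each mini-batch viewed as a multi-class classification task) can be illustrated with a minimal sketch. This is not the GeSSL implementation. It assumes a common contrastive setup in which each of the N samples in a batch has two augmented views, sample i defines class i, and the second view is classified against the first views by cosine similarity. The function name `minibatch_as_classification`, the `temperature` parameter, and the toy data are all hypothetical choices made for illustration.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def minibatch_as_classification(view_a, view_b, temperature=0.5):
    """Treat a mini-batch of N samples as an N-way classification task.

    view_a, view_b: lists of N embedding vectors, two augmented views of
    the same N samples. Row i of view_a plays the role of the prototype
    for class i; row i of view_b is a query whose true label is i.
    Returns the mean cross-entropy loss and the classification accuracy.
    (Illustrative assumption, not the paper's exact formulation.)
    """
    a = [normalize(v) for v in view_a]
    b = [normalize(v) for v in view_b]
    total_loss, correct = 0.0, 0
    for i, query in enumerate(b):
        # Logits: temperature-scaled cosine similarity to every prototype.
        logits = [sum(q * p for q, p in zip(query, proto)) / temperature
                  for proto in a]
        # Numerically stable log-sum-exp for the softmax normalizer.
        top = max(logits)
        log_norm = top + math.log(sum(math.exp(l - top) for l in logits))
        total_loss += log_norm - logits[i]  # cross-entropy for label i
        correct += int(logits.index(top) == i)
    n = len(a)
    return total_loss / n, correct / n


# Toy batch: 8 samples, 16-dimensional embeddings, lightly perturbed views.
random.seed(0)
anchors = [[random.gauss(0, 1) for _ in range(16)] for _ in range(8)]
views = [[x + 0.01 * random.gauss(0, 1) for x in row] for row in anchors]
loss, accuracy = minibatch_as_classification(anchors, views)
print(f"loss={loss:.3f}, accuracy={accuracy:.2f}")
```

Under this reading, every mini-batch defines its own label set, so training over many batches resembles learning from many small classification tasks, which is the setting in which the abstract discusses task-specific models and their integration.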