Clustering Properties of Self-Supervised Learning

📅 2025-01-30
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the underexploited clustering structure of representations in self-supervised learning (SSL). We first systematically demonstrate that encoder outputs exhibit superior and more stable clustering properties than projection-head outputs. Building on this insight, we propose Representation Soft Assignment (ReSA), a positive-feedback-based representation optimization method. Within a joint-embedding framework, ReSA constructs self-guided targets via soft cluster assignments and jointly optimizes representations through contrastive learning and a representation-stability regularizer, enabling collaborative unsupervised learning at both fine and coarse granularity. ReSA requires no additional labels or architectural modules, instead leveraging a clustering-driven positive-feedback mechanism so the model amplifies its own clustering signal. On standard SSL benchmarks, ReSA significantly outperforms state-of-the-art methods, yielding consistent improvements in representation structure, semantic separability, and multiple clustering metrics.

📝 Abstract
Self-supervised learning (SSL) methods built on joint-embedding architectures have proven remarkably effective at capturing semantically rich representations with strong clustering properties, even in the absence of label supervision. Despite this, few of them have explored leveraging these untapped properties to improve themselves. In this paper, we provide evidence across various metrics that the encoder's output encoding exhibits superior and more stable clustering properties compared to other components. Building on this insight, we propose a novel positive-feedback SSL method, termed Representation Soft Assignment (ReSA), which leverages the model's clustering properties to promote learning in a self-guided manner. Extensive experiments on standard SSL benchmarks reveal that models pretrained with ReSA outperform other state-of-the-art SSL methods by a significant margin. Finally, we analyze how ReSA facilitates better clustering properties, demonstrating that it effectively enhances clustering performance at both fine-grained and coarse-grained levels, shaping representations that are inherently more structured and semantically meaningful.
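The abstract describes the core idea of ReSA: within a joint-embedding framework, the soft cluster assignment of one augmented view serves as a self-guided target for the other view. The sketch below illustrates that mechanism in minimal numpy; the function names, the fixed centroid matrix, and the temperature value are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def soft_assignments(z, centroids, temperature=0.1):
    # Cosine similarity between L2-normalized embeddings and cluster
    # centroids, turned into a soft cluster distribution per sample.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    c = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    return softmax(z @ c.T / temperature, axis=1)

def self_guided_loss(z_view1, z_view2, centroids, temperature=0.1):
    # The soft assignment of one view acts as the (detached) target for
    # the other view's assignment: cross-entropy between the two
    # distributions. Matching assignments across views is the
    # positive-feedback signal described in the abstract.
    p = soft_assignments(z_view1, centroids, temperature)  # target
    q = soft_assignments(z_view2, centroids, temperature)  # prediction
    return -np.mean(np.sum(p * np.log(q + 1e-12), axis=1))

# Toy example: two "views" of the same batch differ by a small perturbation.
rng = np.random.default_rng(0)
z1 = rng.normal(size=(8, 16))
z2 = z1 + 0.05 * rng.normal(size=(8, 16))
centroids = rng.normal(size=(4, 16))  # hypothetical cluster prototypes
loss = self_guided_loss(z1, z2, centroids)
print(float(loss))
```

In a real SSL pipeline the target assignment would be computed with a stop-gradient and the loss combined with a contrastive term; here a fixed centroid matrix stands in for whatever prototype estimate the method maintains.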
Problem

Research questions and friction points this paper is trying to address.

Self-Supervised Learning
Representation Accuracy
Data Organization
Innovation

Methods, ideas, or system contributions that make the work stand out.

ReSA
Self-supervised Learning
Soft Assignment
Xi Weng
School of Computing, National University of Singapore
Jianing An
SKLCCSE, School of Artificial Intelligence, Beihang University
Xudong Ma
SKLCCSE, School of Artificial Intelligence, Beihang University
Binhang Qi
National University of Singapore
DNN Modularization · Model Reuse · Software Engineering · Deep Learning
Jie Luo
SKLCCSE, School of Artificial Intelligence, Beihang University
Xi Yang
Beijing Academy of Artificial Intelligence
Jin Song Dong
Professor of Computer Science, National University of Singapore
Formal Methods · Trusted AI · Safe AI · Model Checking · Sports Analytics
Lei Huang
SKLCCSE, School of Artificial Intelligence, Beihang University