Cluster Contrast for Unsupervised Visual Representation Learning

📅 2025-07-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
In unsupervised visual representation learning, it is difficult to obtain a feature space that is both well separated between classes and compact within each class. To address this, we propose Cluster Contrast (CueCo), an end-to-end framework that jointly optimizes a contrastive objective and a clustering objective. Its core is a query-key network architecture with a momentum-updated key encoder: the contrastive loss pushes dissimilar features apart, while a clustering-driven compactness constraint pulls features of the same cluster together, explicitly structuring the feature distribution. Implemented on a ResNet-18 backbone, CueCo requires no manual annotations or strong data augmentations. Under the standard linear evaluation protocol, it achieves 91.40%, 68.56%, and 78.65% top-1 accuracy on CIFAR-10, CIFAR-100, and ImageNet-100, respectively, outperforming state-of-the-art unsupervised methods. CueCo thus offers an interpretable, structurally guided approach to unsupervised representation learning.

📝 Abstract
We introduce Cluster Contrast (CueCo), a novel approach to unsupervised visual representation learning that effectively combines the strengths of contrastive learning and clustering methods. Inspired by recent advancements, CueCo is designed to simultaneously scatter and align feature representations within the feature space. This method utilizes two neural networks, a query and a key, where the key network is updated through a slow-moving average of the query outputs. CueCo employs a contrastive loss to push dissimilar features apart, enhancing inter-class separation, and a clustering objective to pull together features of the same cluster, promoting intra-class compactness. Our method achieves 91.40% top-1 classification accuracy on CIFAR-10, 68.56% on CIFAR-100, and 78.65% on ImageNet-100 using linear evaluation with a ResNet-18 backbone. By integrating contrastive learning with clustering, CueCo sets a new direction for advancing unsupervised visual representation learning.
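The slow-moving average that updates the key network can be illustrated with a minimal sketch. It assumes the average is applied to the network weights with a fixed momentum coefficient `m`, as is common in momentum-encoder methods; the function name `momentum_update`, the value of `m`, and the toy weights are hypothetical and are not taken from the paper.

```python
def momentum_update(query_params, key_params, m=0.99):
    """Return key parameters moved slightly toward the query parameters.

    Assumed rule (momentum-encoder style): key <- m * key + (1 - m) * query.
    The paper's exact update may differ.
    """
    if len(query_params) != len(key_params):
        raise ValueError("parameter lists must have the same length")
    return [m * k + (1.0 - m) * q for q, k in zip(query_params, key_params)]


# Toy example: two "networks" represented by three scalar weights each.
query = [1.0, 2.0, 3.0]
key = [0.0, 0.0, 0.0]
for _ in range(3):
    key = momentum_update(query, key, m=0.9)
# After n steps with a fixed query, key equals query * (1 - m ** n),
# so the key trails the query and approaches it only gradually.
```

A momentum close to 1 keeps the key network stable between steps, which is the usual motivation for this kind of update.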
Problem

Research questions and friction points this paper is trying to address.

Combines contrastive learning and clustering for unsupervised visual representation
Enhances inter-class separation and intra-class compactness in feature space
Reaches 91.40% (CIFAR-10), 68.56% (CIFAR-100), and 78.65% (ImageNet-100) top-1 accuracy under linear evaluation with a ResNet-18 backbone
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines contrastive learning with clustering methods
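Uses a query network and a key network, with the key updated through a slow-moving average of the query
Employs a contrastive loss to push dissimilar features apart and a clustering objective to pull together features of the same cluster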
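How a contrastive term and a clustering term might be combined can be sketched as follows. This is an illustration under stated assumptions, not the paper's loss: the contrastive term is a generic InfoNCE-style loss, the clustering term is the squared distance to the nearest centroid, and the names `info_nce`, `cluster_compactness`, `combined_loss`, the temperature, and the weight are all hypothetical.

```python
import math


def l2_normalize(v):
    # Scale a non-zero vector to unit length.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(query, positive_key, negative_keys, temperature=0.2):
    """InfoNCE-style contrastive loss for a single query (assumed form)."""
    q = l2_normalize(query)
    logits = [dot(q, l2_normalize(positive_key)) / temperature]
    logits += [dot(q, l2_normalize(n)) / temperature for n in negative_keys]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum - logits[0]  # equals -log softmax of the positive logit


def cluster_compactness(feature, centroids):
    """Squared distance from the normalized feature to its nearest centroid."""
    f = l2_normalize(feature)
    return min(sum((a - b) ** 2 for a, b in zip(f, c)) for c in centroids)


def combined_loss(query, positive_key, negative_keys, centroids, weight=1.0):
    # Separation term plus a weighted compactness term (the weight is assumed).
    contrastive = info_nce(query, positive_key, negative_keys)
    return contrastive + weight * cluster_compactness(query, centroids)


centroids = [[1.0, 0.0], [0.0, 1.0]]
query = [1.0, 0.0]
# The positive key agrees with the query, the negative does not: low loss.
aligned = combined_loss(query, [0.9, 0.1], [[0.0, 1.0]], centroids)
# Roles swapped: the "positive" is dissimilar, so the loss is much higher.
misaligned = combined_loss(query, [0.0, 1.0], [[0.9, 0.1]], centroids)
```

In this toy setup the contrastive term drives the difference between `aligned` and `misaligned`, while the compactness term is zero because the query sits exactly on a centroid.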