🤖 AI Summary
In contrastive learning, nearest-neighbor positive samples are often "easy" pairs, already highly similar in embedding space, which limits positive diversity and representation discriminability. To address this, the authors propose CLSP (Contrastive Learning with Synthetic Positives): a framework that introduces images generated by an unconditional diffusion model as semantically consistent yet background-diverse *hard positives* for contrastive learning. Semantics-preserving image synthesis is achieved via feature interpolation in the diffusion sampling process, and the contrastive loss is extended to jointly optimize over both real and synthetic positives. In linear evaluation on benchmarks such as CIFAR-10, CLSP surpasses NNCLR and All4One by over 2% and 1%, respectively, and on transfer learning benchmarks it outperforms existing SSL frameworks on 6 of 8 downstream datasets. The core contribution is a diffusion-based paradigm for generating hard positives that enhances the discriminability and generalization of contrastive representations.
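The "latent-space feature interpolation" mentioned above can be pictured, in a loose sketch, as blending the anchor image's intermediate diffusion features into the sampling trajectory. The function name, the blending site, and the linear form are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def interpolate_latents(h_sample: np.ndarray, h_anchor: np.ndarray,
                        alpha: float = 0.3) -> np.ndarray:
    """Hypothetical sketch: linearly blend the anchor image's intermediate
    diffusion features (h_anchor) into the current sampling features
    (h_sample). A larger alpha preserves more of the anchor's semantics,
    while the rest of the sampling process diversifies the background."""
    return (1.0 - alpha) * h_sample + alpha * h_anchor
```

In this picture, the sampler runs as usual except that, at chosen steps, its intermediate features are nudged toward the anchor's, so the generated image stays semantically close to the anchor while the background varies.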
📝 Abstract
Contrastive learning with the nearest neighbor has proved to be one of the most efficient self-supervised learning (SSL) techniques by utilizing the similarity of multiple instances within the same class. However, its efficacy is constrained because the nearest-neighbor algorithm primarily identifies "easy" positive pairs, whose representations are already closely located in the embedding space. In this paper, we introduce a novel approach called Contrastive Learning with Synthetic Positives (CLSP) that utilizes synthetic images, generated by an unconditional diffusion model, as additional positives to help the model learn from diverse positives. Through feature interpolation in the diffusion sampling process, we generate images with distinct backgrounds yet semantic content similar to the anchor image. These images serve as "hard" positives for the anchor image, and when included as supplementary positives in the contrastive loss, they yield a linear-evaluation improvement of over 2% and 1% compared to the previous NNCLR and All4One methods across multiple benchmark datasets such as CIFAR10, achieving state-of-the-art performance. On transfer learning benchmarks, CLSP outperforms existing SSL frameworks on 6 out of 8 downstream datasets. We believe CLSP establishes a valuable baseline for future SSL studies incorporating synthetic data in the training process.
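The idea of treating the synthetic image as a supplementary positive in the contrastive loss can be sketched as follows. This is a minimal numpy illustration, assuming an InfoNCE-style objective with one extra weighted term for the synthetic positive; the function names, the weighting scheme `lam`, and the exact loss form are assumptions, not the paper's definition:

```python
import numpy as np

def _normalize(x: np.ndarray) -> np.ndarray:
    # L2-normalize each row so dot products become cosine similarities
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def _info_nce(q: np.ndarray, k: np.ndarray, temperature: float) -> float:
    # Standard InfoNCE: matching rows of q and k are positive pairs,
    # all other rows of k act as negatives for each anchor.
    logits = q @ k.T / temperature                   # (N, N) similarities
    log_denom = np.log(np.exp(logits).sum(axis=1))   # log-sum-exp per anchor
    pos = np.diag(logits)                            # positive-pair logits
    return float(np.mean(log_denom - pos))

def clsp_loss(anchor: np.ndarray, view: np.ndarray, synthetic: np.ndarray,
              temperature: float = 0.1, lam: float = 0.5) -> float:
    """Hypothetical sketch: one InfoNCE term for the standard augmented
    view plus a weighted term pulling the anchor toward its
    diffusion-generated "hard" positive."""
    a = _normalize(anchor)
    v = _normalize(view)
    s = _normalize(synthetic)
    return _info_nce(a, v, temperature) + lam * _info_nce(a, s, temperature)
```

Because the synthetic image shares semantics with the anchor but not its background, the extra term pulls together representations that vanilla augmentations (or nearest neighbors) would rarely pair, which is the intended source of the harder positive signal.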