SynCo: Synthetic Hard Negatives in Contrastive Learning for Better Unsupervised Visual Representations

📅 2024-10-03

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In contrastive learning, inefficient hard negative mining limits the quality of self-supervised visual representations. To address this, we propose a lightweight, online strategy that synthesizes hard negatives directly in the representation space. Built on the MoCo framework, it introduces six hard negative generation methods that need no annotations or extra augmentations, spanning interpolation, perturbation, mixing, and gradient-guided mechanisms, with negligible computational overhead. On ImageNet linear evaluation, the approach improves top-1 accuracy by +0.4% over MoCo-v2 and +1.0% over MoCHI; it attains 57.2% AP on PASCAL VOC object detection and improves over MoCo-v2 on COCO object detection and instance segmentation by +1.0% and +0.8% AP, respectively. The core contribution is an efficient, scalable, and dynamic hard negative synthesis paradigm that enables more effective unsupervised visual representation learning without additional supervision or data augmentation.

📝 Abstract

Contrastive learning has become a dominant approach in self-supervised visual representation learning, but efficiently leveraging hard negatives, which are samples closely resembling the anchor, remains challenging. We introduce SynCo (Synthetic negatives in Contrastive learning), a novel approach that improves model performance by generating synthetic hard negatives in the representation space. Building on the MoCo framework, SynCo introduces six strategies for creating diverse synthetic hard negatives on-the-fly with minimal computational overhead. SynCo achieves faster training and strong representation learning, surpassing MoCo-v2 by +0.4% and MoCHI by +1.0% on ImageNet ILSVRC-2012 linear evaluation. It also transfers more effectively to detection tasks, achieving strong results on PASCAL VOC detection (57.2% AP) and significantly improving over MoCo-v2 on COCO detection (+1.0% AP) and instance segmentation (+0.8% AP). Our synthetic hard negative generation approach significantly enhances visual representations learned through self-supervised contrastive learning.
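
The abstract does not spell out the six strategies, so the following is only a minimal sketch of what an interpolation-style synthetic hard negative could look like in a MoCo-like setup. It assumes L2-normalized embeddings, a queue of past key embeddings, and an InfoNCE loss. The function names, the mixing range `alpha_max`, and the temperature are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along the given axis."""
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)

def hardest_negatives(query, queue, k):
    """Return the k queue entries with the highest similarity to the query."""
    sims = queue @ query                      # cosine similarity on unit vectors
    top = np.argsort(-sims)[:k]               # indices of the k most similar entries
    return queue[top]

def interpolated_negatives(query, hard, alpha_max=0.5, rng=None):
    """Hypothetical strategy: pull each hard negative part of the way toward the query.

    alpha is drawn uniformly from [0, alpha_max); keeping alpha_max <= 0.5 is an
    assumption meant to keep the negative as the dominant component of the mix.
    """
    rng = np.random.default_rng() if rng is None else rng
    alpha = rng.uniform(0.0, alpha_max, size=(hard.shape[0], 1))
    mixed = alpha * query[None, :] + (1.0 - alpha) * hard
    return l2_normalize(mixed)

def info_nce(query, positive, negatives, temperature=0.2):
    """InfoNCE loss for one query; temperature is an illustrative value."""
    logits = np.concatenate([[query @ positive], negatives @ query]) / temperature
    logits -= logits.max()                    # subtract the max for numerical stability
    return -(logits[0] - np.log(np.exp(logits).sum()))

# Toy data standing in for encoder outputs and the MoCo queue.
rng = np.random.default_rng(0)
dim, queue_size, k = 128, 1024, 16
query = l2_normalize(rng.normal(size=dim))
positive = l2_normalize(query + 0.1 * rng.normal(size=dim))
queue = l2_normalize(rng.normal(size=(queue_size, dim)))

hard = hardest_negatives(query, queue, k)
synthetic = interpolated_negatives(query, hard, rng=rng)

loss_base = info_nce(query, positive, queue)
loss_syn = info_nce(query, positive, np.concatenate([queue, synthetic]))
print(f"loss without synthetic negatives: {loss_base:.4f}")
print(f"loss with synthetic negatives:    {loss_syn:.4f}")
```

In this sketch the synthetic negatives are appended to the queue negatives only for the current loss computation, so nothing extra is stored. Because each synthetic point lies closer to the query than the negative it was built from, it contributes a larger term to the loss denominator, which is the sense in which it is "harder".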

Problem

Research questions and friction points this paper is trying to address.

How to leverage hard negatives efficiently in contrastive learning

How to improve self-supervised visual representations without extra supervision

How to transfer learned representations more effectively to detection tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates synthetic hard negatives

Enhances self-supervised contrastive learning

Improves transfer to detection tasks

🔎 Similar Papers

HateSieve: A Contrastive Learning Framework for Detecting and Segmenting Hateful Content in Multimodal Memes

2024-08-11 · arXiv.org · Citations: 2
