Beyond Augmentation: Leveraging Inter-Instance Relation in Self-Supervised Representation Learning

📅 2025-10-25
🤖 AI Summary
Traditional self-supervised learning relies heavily on data augmentation while neglecting semantic relationships among instances. To address this, we propose a systematic incorporation of graph-structured modeling that explicitly captures inter-instance associations. Specifically, we construct k-nearest neighbor (k-NN) graphs over a teacher–student dual-stream architecture and integrate graph neural networks (GNNs) to enable multi-hop message passing, thereby unifying local augmentations with global contextual information. The approach couples this k-NN-based dual-stream design with a representation refinement mechanism, moving beyond the conventional paradigm of learning solely from intra-instance variations. Extensive experiments demonstrate consistent improvements in linear evaluation accuracy: +7.3% on CIFAR-10, +3.2% on ImageNet-100, and +1.0% on ImageNet-1K, outperforming state-of-the-art methods. These results validate the effectiveness and generalizability of explicitly modeling inter-instance relationships for self-supervised representation learning.

📝 Abstract
This paper introduces a novel approach that integrates graph theory into self-supervised representation learning. Traditional methods focus on intra-instance variations generated by applying augmentations, but they often overlook important inter-instance relationships. Our method retains this intra-instance property while additionally capturing inter-instance relationships by constructing k-nearest neighbor (k-NN) graphs for both the teacher and student streams during pretraining. In these graphs, nodes represent samples together with their latent representations, and edges encode the similarity between instances. After pretraining, a representation refinement phase is performed in which Graph Neural Networks (GNNs) propagate messages not only among immediate neighbors but also across multiple hops, enabling broader contextual integration. Experimental results on CIFAR-10, ImageNet-100, and ImageNet-1K demonstrate accuracy improvements of 7.3%, 3.2%, and 1.0%, respectively, over state-of-the-art methods. These results highlight the effectiveness of the proposed graph-based mechanism. The code is publicly available at https://github.com/alijavidani/SSL-GraphNNCLR.
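The abstract's graph construction (nodes are samples with latent representations, edges encode similarity) can be sketched as follows. This is a minimal illustration under cosine similarity, not the authors' released implementation; `build_knn_graph` is a hypothetical helper name.

```python
import numpy as np

def build_knn_graph(embeddings, k=2):
    """Build a k-NN graph over latent representations.

    Nodes are samples; each node is connected to its k most similar
    neighbors under cosine similarity. Returns a dense adjacency
    matrix holding the similarity weights (illustrative sketch, not
    the paper's codebase).
    """
    # L2-normalize so dot products equal cosine similarities
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-edges

    n = sim.shape[0]
    adj = np.zeros((n, n))
    # keep only the k highest-similarity neighbors of each node
    nbrs = np.argsort(-sim, axis=1)[:, :k]
    for i in range(n):
        adj[i, nbrs[i]] = sim[i, nbrs[i]]
    return adj

# Toy latent space: two tight clusters of two points each
emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
adj = build_knn_graph(emb, k=1)  # each node keeps its single nearest neighbor
```

In the paper's setting one such graph would be built per stream (teacher and student); here a single graph suffices to show the structure.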
Problem

Research questions and friction points this paper is trying to address.

Capturing inter-instance relationships in self-supervised learning
Integrating graph theory to model sample similarity
Refining representations using multi-hop graph neural networks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates graph theory into self-supervised representation learning
Constructs KNN graphs capturing inter-instance relationships during pretraining
Uses GNNs for multi-hop message propagation in refinement phase
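The multi-hop propagation in the refinement phase can be illustrated with simple neighbor averaging stacked over several rounds: after L rounds, a node's feature mixes in information from nodes up to L hops away. This is a generic message-passing sketch, not the specific GNN the authors use; `multi_hop_propagate` is a hypothetical name.

```python
import numpy as np

def multi_hop_propagate(features, adj, hops=2):
    """Average-based message passing over a graph.

    Each hop replaces every node's feature with the mean of its own
    and its neighbors' features; stacking `hops` rounds lets signal
    travel along multi-hop paths (illustrative, parameter-free stand-in
    for a learned GNN layer).
    """
    n = adj.shape[0]
    a_hat = (adj > 0).astype(float) + np.eye(n)  # binarize edges, add self-loops
    a_hat /= a_hat.sum(axis=1, keepdims=True)    # row-normalize: rows become means
    h = features.astype(float)
    for _ in range(hops):
        h = a_hat @ h  # one round of neighbor averaging
    return h

# Path graph 0-1-2 with one-hot features: node 2 only "sees" node 0
# once two hops of propagation are applied.
adj = np.array([[0., 1., 0.],
                [1., 0., 1.],
                [0., 1., 0.]])
feats = np.eye(3)
h1 = multi_hop_propagate(feats, adj, hops=1)
h2 = multi_hop_propagate(feats, adj, hops=2)
```

With one hop, node 2's feature contains no contribution from node 0 (they are two hops apart); with two hops the contribution becomes nonzero, which is exactly the "broader contextual integration" the abstract describes.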
Ali Javidani
School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran
Babak Nadjar Araabi
School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran
Mohammad Amin Sadeghi
Qatar Computing Research Institute
Machine Learning · Computer Vision