Multi-level Supervised Contrastive Learning

📅 2025-02-04
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing contrastive learning approaches typically employ a single projection head, limiting their ability to model fine-grained semantic similarities among samples with multi-label and hierarchical label structures, especially in few-shot settings. To address this, the authors propose Multi-level Contrastive Learning (MLCL), a framework featuring multiple dedicated nonlinear projection heads, each designed to capture inter-label and inter-hierarchy similarities. MLCL introduces a hierarchy-aware positive/negative sample construction strategy and jointly optimizes multiple supervised contrastive losses. Notably, this is the first work to explicitly integrate multi-level semantic supervision into the contrastive learning architecture. Extensive experiments on text and image multi-label and hierarchical classification benchmarks demonstrate substantial improvements over state-of-the-art methods, with particularly pronounced gains under low-resource conditions, achieving average accuracy improvements of +3.2% to +5.8%.

📝 Abstract
Contrastive learning is a well-established paradigm in representation learning. The standard framework of contrastive learning minimizes the distance between "similar" instances and maximizes the distance between dissimilar ones in the projection space, disregarding the various aspects of similarity that can exist between two samples. Current methods rely on a single projection head, which fails to capture the full complexity of different aspects of a sample, leading to suboptimal performance, especially in scenarios with limited training data. In this paper, we present a novel supervised contrastive learning method in a unified framework called multilevel contrastive learning (MLCL), that can be applied to both multi-label and hierarchical classification tasks. The key strength of the proposed method is the ability to capture similarities between samples across different labels and/or hierarchies using multiple projection heads. Extensive experiments on text and image datasets demonstrate that the proposed approach outperforms state-of-the-art contrastive learning methods.
Problem

Research questions and friction points this paper is trying to address.

Enhance similarity capture in contrastive learning
Address limitations of single projection head
Improve performance in multi-label classification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-level supervised contrastive learning
Multiple projection heads used
Captures cross-label hierarchical similarities
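
The innovation bullets above can be sketched in code. This is a minimal NumPy illustration, not the paper's implementation: one projection head per label level, a supervised contrastive loss computed per level, and a weighted sum of the per-level losses. The single-`tanh` projection form, the head parameterization as fixed matrices, and the uniform loss weights are assumptions made for the sketch.

```python
import numpy as np

def sup_con_loss(z, labels, temperature=0.1):
    # z: (n, d) L2-normalized embeddings; labels: (n,) labels for one level.
    n = len(labels)
    sim = z @ z.T / temperature
    # Exclude self-similarity from the softmax by pushing the diagonal to -inf.
    logits = sim - 1e9 * np.eye(n)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives: samples sharing this level's label, excluding the anchor itself.
    pos = (labels[:, None] == labels[None, :]) & ~np.eye(n, dtype=bool)
    # Average log-probability over positives, for anchors that have positives.
    per_anchor = [log_prob[i, pos[i]].mean() for i in range(n) if pos[i].any()]
    return -np.mean(per_anchor)

def multilevel_loss(features, heads, level_labels, weights=None):
    # features: (n, d) encoder outputs shared by all heads.
    # heads: one (d, p) projection matrix per label level (hypothetical form).
    # level_labels: one (n,) label array per level, e.g. [coarse, fine].
    weights = weights or [1.0] * len(heads)
    total = 0.0
    for W, y, w in zip(heads, level_labels, weights):
        z = np.tanh(features @ W)  # nonlinear projection head (sketch)
        z = z / np.linalg.norm(z, axis=1, keepdims=True)
        total += w * sup_con_loss(z, y)
    return total
```

In a real system the heads would be learned jointly with the encoder by gradient descent; fixed random matrices stand in for them here so the loss computation itself can be inspected in isolation.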