IConE: Batch Independent Collapse Prevention for Self-Supervised Representation Learning

📅 2026-03-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses representation collapse in self-supervised learning methods that rely heavily on within-batch interactions, which makes them vulnerable under small-batch training. This is a common constraint for high-dimensional scientific data, where memory limits and class imbalance rule out large, well-balanced batches. To overcome this, the authors propose IConE, a framework that decouples collapse prevention from batch-level statistics by introducing a globally learnable set of auxiliary instance embeddings with an explicit diversity regularizer. This enables stable training without dependence on batch statistics, supporting batch sizes as small as one. IConE consistently outperforms state-of-the-art contrastive and non-contrastive methods across diverse 2D and 3D biomedical datasets, maintains high intrinsic dimensionality in the learned representations, and remains robust under extremely small batch settings (B=1–64), thereby effectively mitigating representation collapse.

📝 Abstract
Self-supervised learning (SSL) has revolutionized representation learning, with Joint-Embedding Architectures (JEAs) emerging as an effective approach for capturing semantic features. Existing JEAs rely on implicit or explicit batch interaction (via negative sampling or statistical regularization) to prevent representation collapse. This reliance becomes problematic in regimes where batch sizes must be small, such as high-dimensional scientific data, where memory constraints and class imbalance make large, well-balanced batches infeasible. We introduce IConE (Instance-Contrasted Embeddings), a framework that decouples collapse prevention from the training batch size. Rather than enforcing diversity through batch statistics, IConE maintains a global set of learnable auxiliary instance embeddings regularized by an explicit diversity objective. This transfers the anti-collapse mechanism from the transient batch to a dataset-level embedding space, allowing stable training even when batch statistics are unreliable, down to batch size 1. Across diverse 2D and 3D biomedical modalities, IConE outperforms strong contrastive and non-contrastive baselines throughout the small-batch regime (from B=1 to B=64) and demonstrates marked robustness to severe class imbalance. Geometric analysis shows that IConE preserves high intrinsic dimensionality in the learned representations, preventing the collapse observed in existing JEAs as batch sizes shrink.
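The core idea in the abstract, a dataset-level table of learnable instance embeddings kept apart by an explicit diversity term rather than by batch statistics, can be sketched as follows. This is an illustrative NumPy sketch, not the paper's implementation: the table size, the choice of mean off-diagonal cosine similarity as the diversity objective, and the function names are all assumptions for exposition.

```python
import numpy as np

def diversity_loss(embeddings: np.ndarray) -> float:
    """Hypothetical diversity objective over a GLOBAL table of learnable
    auxiliary instance embeddings, computed without any reference to a
    training batch.

    embeddings: (N, d) dataset-level embedding table.
    Returns the mean off-diagonal cosine similarity; a fully collapsed
    table (all rows identical) scores ~1.0, a well-spread table near 0,
    so minimizing this value pushes embeddings apart.
    """
    # L2-normalize rows so the Gram matrix holds cosine similarities.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    z = embeddings / np.clip(norms, 1e-12, None)
    sim = z @ z.T                              # (N, N) similarity matrix
    n = sim.shape[0]
    off_diag = sim[~np.eye(n, dtype=bool)]     # drop self-similarities
    return float(off_diag.mean())

rng = np.random.default_rng(0)
collapsed = np.tile(rng.normal(size=(1, 16)), (8, 1))  # 8 identical rows
spread = rng.normal(size=(8, 16))                      # 8 random rows
print(diversity_loss(collapsed))   # ~1.0: fully collapsed
print(diversity_loss(spread))      # near 0: diverse
```

Because the loss is defined over the whole table rather than over the current batch, its gradient signal is available even at batch size 1, which is the decoupling the abstract describes.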
Problem

Research questions and friction points this paper is trying to address.

self-supervised learning
representation collapse
batch size
class imbalance
Joint-Embedding Architectures
Innovation

Methods, ideas, or system contributions that make the work stand out.

IConE
self-supervised learning
representation collapse
instance-contrasted embeddings
small-batch training
Konstantinos Almpanakis
European Molecular Biology Laboratory, Heidelberg, Germany
Anna Kreshuk
Group Leader, EMBL Heidelberg
Machine Learning · Computer Vision · Biomedical Image Analysis