Self-Soupervision: Cooking Model Soups without Labels

📅 2026-02-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes a model fusion framework for self-supervised learning that operates without labeled data, addressing the limitations of existing fusion methods that rely on supervision and ground-truth labels: their inapplicability in label-scarce settings and their limited robustness under distribution shift. By training diverse self-supervised components (e.g., MAE, MoCoV3, MMCR) on unlabeled data and fusing them via parameter interpolation, the approach can flexibly integrate models with heterogeneous SSL algorithms and hyperparameter configurations. Robustness is further improved by souping on corrupted test data and then fine-tuning back on uncorrupted training data. Empirically, the fused model yields +3.5% and +7% robustness gains over the best individual component on the ImageNet-C and LAION-C benchmarks, respectively, establishing a new paradigm for unsupervised model ensembling.

📝 Abstract
Model soups are strange and strangely effective combinations of parameters. They take a model (the stock), fine-tune it into multiple models (the ingredients), and then mix their parameters back into one model (the soup) to improve predictions. While all known soups require supervised learning and optimize the same loss on labeled data, our recipes for Self-Soupervision generalize soups to self-supervised learning (SSL). Our Self-Souping lets us flavor ingredients on new data sources, e.g. unlabeled data from a new task for transfer or from a distribution shift for robustness. We show that Self-Souping on corrupted test data, then fine-tuning back on uncorrupted train data, boosts robustness by +3.5% (ImageNet-C) and +7% (LAION-C). Self-Soupervision also unlocks countless SSL algorithms to cook the diverse ingredients needed for more robust soups. We show for the first time that ingredients can differ in their SSL hyperparameters -- and more surprisingly, in their SSL algorithms. We cook soups of MAE, MoCoV3, and MMCR ingredients that are more accurate than any single SSL ingredient.
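The core souping step described above (fine-tune one stock model into several ingredients, then mix their parameters back into one model) reduces to elementwise averaging of matching parameter tensors. The sketch below is an illustrative uniform-soup average over NumPy parameter dicts, not the authors' code; the `uniform_soup` name and dict-of-arrays representation are assumptions for the example.

```python
import numpy as np

def uniform_soup(ingredient_params):
    """Elementwise-average a list of parameter dicts (name -> array).

    Illustrative sketch: all ingredients are assumed to share the same
    parameter names and shapes, e.g. fine-tuned copies of one stock model.
    """
    names = ingredient_params[0].keys()
    n = len(ingredient_params)
    return {name: sum(p[name] for p in ingredient_params) / n for name in names}

# Toy example: three "fine-tuned" weight sets for a single layer.
a = {"w": np.array([1.0, 2.0])}
b = {"w": np.array([3.0, 4.0])}
c = {"w": np.array([5.0, 6.0])}
soup = uniform_soup([a, b, c])
print(soup["w"])  # [3. 4.]
```

In a real pipeline the same average would run over every tensor in a model's weights; weighted or greedy variants keep the same elementwise-interpolation form.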
Problem

Research questions and friction points this paper is trying to address.

model soups
self-supervised learning
robustness
transfer learning
unlabeled data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-Soupervision
model soups
self-supervised learning
robustness
parameter averaging
A. Fuller
Carleton University, Ottawa, Canada; Vector Institute, Toronto, Canada
James R. Green
Carleton University, Ottawa, Canada
Evan Shelhamer
UBC / Vector Institute / CIFAR AI Chair
computer vision
machine learning
deep learning