Revisiting semi-supervised learning in the era of foundation models

📅 2025-03-12
🤖 AI Summary
With vision foundation models (VFMs) dominating modern computer vision, the synergy between traditional semi-supervised learning (SSL) and pre-trained models remains poorly understood. Method: We identify that parameter-efficient fine-tuning (PEFT) achieves performance on par with—or even surpassing—that of classical SSL methods using only minimal labeled data, suggesting a paradigm shift for SSL in the VFM era. To this end, we propose a novel self-training framework integrating multiple PEFT modules with diverse VFM backbones (e.g., ViT, CLIP), leveraging ensemble-based pseudo-label generation to significantly suppress label noise. Contribution/Results: Evaluated on a newly constructed SSL benchmark, our approach achieves state-of-the-art performance at extremely low labeling costs (e.g., 1% labeled data), validating a lightweight, scalable, and assumption-light SSL paradigm—eliminating the need for strong distributional assumptions about unlabeled data.

📝 Abstract
Semi-supervised learning (SSL) leverages abundant unlabeled data alongside limited labeled data to enhance learning. As vision foundation models (VFMs) increasingly serve as the backbone of vision applications, it remains unclear how SSL interacts with these pre-trained models. To address this gap, we develop new SSL benchmark datasets where frozen VFMs underperform and systematically evaluate representative SSL methods. We make a surprising observation: parameter-efficient fine-tuning (PEFT) using only labeled data often matches SSL performance, even without leveraging unlabeled data. This motivates us to revisit self-training, a conceptually simple SSL baseline, where we use the supervised PEFT model to pseudo-label unlabeled data for further training. To overcome the notorious issue of noisy pseudo-labels, we propose ensembling multiple PEFT approaches and VFM backbones to produce more robust pseudo-labels. Empirical results validate the effectiveness of this simple yet powerful approach, providing actionable insights into SSL with VFMs and paving the way for more scalable and practical semi-supervised learning in the era of foundation models.
Problem

Research questions and friction points this paper is trying to address.

Explores SSL interaction with pre-trained vision foundation models.
Evaluates SSL methods on new benchmark datasets with VFMs.
Proposes ensembling PEFT approaches to improve pseudo-label robustness.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parameter-efficient fine-tuning with labeled data alone matches SSL performance.
Self-training with PEFT for pseudo-labeling unlabeled data.
Ensembling PEFT approaches and VFM backbones to improve pseudo-label robustness.
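The ensembling idea in the bullets above can be sketched as follows: average the class probabilities predicted by several fine-tuned models on unlabeled data, then keep only high-confidence predictions as pseudo-labels for the next round of self-training. This is a minimal illustration, not the paper's implementation; the `threshold` value and the toy probability matrices are assumptions.

```python
import numpy as np

def ensemble_pseudo_labels(prob_list, threshold=0.9):
    """Average softmax outputs from several models and keep only
    high-confidence predictions as pseudo-labels."""
    avg = np.mean(prob_list, axis=0)   # (n_samples, n_classes) ensemble average
    conf = avg.max(axis=1)             # ensemble confidence per sample
    labels = avg.argmax(axis=1)        # ensemble-predicted class per sample
    keep = conf >= threshold           # confidence filter to suppress noisy labels
    return labels[keep], np.flatnonzero(keep)

# Toy example: two hypothetical PEFT-tuned models, 3 unlabeled samples, 3 classes.
p1 = np.array([[0.95, 0.03, 0.02],
               [0.40, 0.35, 0.25],
               [0.05, 0.90, 0.05]])
p2 = np.array([[0.97, 0.02, 0.01],
               [0.30, 0.45, 0.25],
               [0.05, 0.92, 0.03]])
labels, kept = ensemble_pseudo_labels([p1, p2], threshold=0.9)
# Sample 1 is dropped: the models disagree, so the averaged confidence is low.
```

The confidence filter is what makes the ensemble useful here: samples on which the models disagree fall below the threshold and are excluded, so fewer noisy pseudo-labels enter the self-training loop.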
Ping Zhang
Department of Computer Science and Engineering, The Ohio State University
Zheda Mai
Ohio State University
Continual Learning · Parameter Efficient Fine Tuning · Vision Foundation Models
Quang-Huy Nguyen
Undergraduate Student, VNU University of Engineering and Technology
Recommender Systems · Trustworthy AI · Large Language Models
Wei-Lun Chao
Department of Computer Science and Engineering, The Ohio State University