Diversify and Conquer: Open-Set Disagreement for Robust Semi-Supervised Learning With Outliers.

๐Ÿ“… 2025-03-28
๐Ÿ›๏ธ IEEE Transactions on Neural Networks and Learning Systems
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
Traditional semi-supervised learning (SSL) assumes that labeled and unlabeled data are drawn from the same distribution; however, in practice, unlabeled data often contain unknown-class outliers, severely degrading model performance. Existing open-set SSL approaches rely on prediction discrepancies from a single model to detect outliers, which becomes unreliable under extreme label scarcity. To address this, we propose a multi-head divergence-driven robust open-set SSL framework: within a single training pipeline, multiple prediction heads are jointly optimized to exhibit high agreement on known classes but substantial disagreement on unknown-class samples. Leveraging divergence regularization, consistency constraints, and uncertainty-aware pseudo-labeling, our method enables automatic identification and suppression of outliers. Extensive experiments across diverse open-set protocols demonstrate that our approach significantly outperforms state-of-the-art methods, achieving both high accuracy and strong robustnessโ€”even with minimal labeled data.

๐Ÿ“ Abstract
Conventional semi-supervised learning (SSL) ideally assumes that labeled and unlabeled data share an identical class distribution; however, in practice, this assumption is easily violated, as unlabeled data often include unknown-class data, i.e., outliers. These outliers are treated as noise, considerably degrading the performance of SSL models. To address this drawback, we propose a novel framework, diversify and conquer (DAC), to enhance SSL robustness in the context of open-set SSL (OSSL). In particular, we note that existing OSSL methods rely on prediction discrepancies between inliers and outliers from a single model trained on labeled data. This approach can easily fail when labeled data are insufficient, leading to performance degradation worse than that of naive SSL methods that do not account for outliers. In contrast, our approach exploits prediction disagreements among multiple models that are differently biased toward the unlabeled distribution. By leveraging the discrepancies arising from training on unlabeled data, our method enables robust outlier detection even when the labeled data are underspecified. Our key contribution is constructing a collection of differently biased models through a single training process. By encouraging divergent heads to be differently biased toward outliers while making consistent predictions for inliers, we exploit the disagreement among these heads as a measure to identify unknown concepts. Extensive experiments demonstrate that our method significantly surpasses state-of-the-art OSSL methods across various protocols.
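The core mechanism described above, scoring a sample by how much the divergent heads disagree, can be sketched as follows. This is an illustrative reconstruction, not the paper's exact formulation: the mean pairwise L1 distance between head outputs is an assumed disagreement measure, and the example head probabilities are made up for demonstration.

```python
import numpy as np

def disagreement_score(head_probs):
    """Mean pairwise L1 distance between head predictions.

    head_probs: array of shape (H, C) holding the softmax outputs
    of H prediction heads over C known classes. A high score means
    the heads disagree, suggesting an unknown-class outlier.
    """
    H = head_probs.shape[0]
    total, pairs = 0.0, 0
    for i in range(H):
        for j in range(i + 1, H):
            total += np.abs(head_probs[i] - head_probs[j]).sum()
            pairs += 1
    return total / pairs

# Heads largely agree on an inlier from a known class...
inlier = np.array([[0.90, 0.05, 0.05],
                   [0.88, 0.07, 0.05],
                   [0.92, 0.04, 0.04]])

# ...but, being differently biased, diverge on an outlier.
outlier = np.array([[0.8, 0.1, 0.1],
                    [0.1, 0.8, 0.1],
                    [0.1, 0.1, 0.8]])

assert disagreement_score(outlier) > disagreement_score(inlier)
```

Samples whose score exceeds a threshold would then be suppressed during pseudo-labeling; the paper additionally enforces consistency on inliers so that low disagreement is a reliable inlier signal.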
Problem

Research questions and friction points this paper is trying to address.

Addresses SSL performance degradation from unlabeled outliers
Proposes multi-model disagreement for robust outlier detection
Enhances open-set SSL with divergent biased heads
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multiple models with different biases
Disagreement among heads for outlier detection
Single training for divergent heads
๐Ÿ”Ž Similar Papers
No similar papers found.