🤖 AI Summary
To address the degradation of model performance in semi-supervised text classification caused by noisy pseudo-labels and extreme class imbalance, this paper proposes a multi-head consistency-regularization matching framework. Its core innovation is a three-fold pseudo-label weighting module that integrates multi-head co-training, self-adaptive thresholding for pseudo-label selection, and Average Pseudo-Margin-based difficulty-aware weighting, enabling robust pseudo-label selection, noise filtering, and discriminative weighting in a single unified component. The framework improves both generalization and robustness under noisy and long-tailed label distributions. Experiments across five standard NLP benchmarks and ten imbalance settings show state-of-the-art (SOTA) performance in nine of the ten setups; a Friedman-test ranking places the method first overall; and in highly imbalanced scenarios it outperforms the second-best approach by 3.26% on average.
📝 Abstract
We introduce MultiMatch, a novel semi-supervised learning (SSL) algorithm combining the paradigms of co-training and consistency regularization with pseudo-labeling. At its core, MultiMatch features a three-fold pseudo-label weighting module that serves three key purposes: selecting pseudo-labels based on head agreement, filtering them based on model confidence, and weighting them according to the perceived classification difficulty. This novel module enhances and unifies three existing techniques: heads agreement from Multihead Co-training, self-adaptive thresholds from FreeMatch, and Average Pseudo-Margins from MarginMatch, resulting in a holistic approach that improves robustness and performance in SSL settings. Experimental results on benchmark datasets highlight the superior performance of MultiMatch, which achieves state-of-the-art results on 9 out of 10 setups from 5 natural language processing datasets and ranks first among 19 methods according to the Friedman test. Furthermore, MultiMatch demonstrates exceptional robustness in highly imbalanced settings, outperforming the second-best approach by 3.26%, a property that matters because data imbalance is a key factor in many text classification tasks.
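To make the three-fold weighting idea concrete, here is a minimal NumPy sketch of how the three criteria could be combined into a per-example pseudo-label weight. Everything below is an illustrative simplification, not the paper's implementation: the function name, the fixed per-class thresholds, the precomputed `avg_margins` array standing in for Average Pseudo-Margins, and the choice of mean confidence as the final weight are all assumptions for the sake of the example (the actual method updates thresholds self-adaptively and derives margins across training epochs).

```python
import numpy as np

def weight_pseudo_labels(head_probs, thresholds, avg_margins, margin_floor=0.0):
    """Illustrative three-fold pseudo-label selection and weighting.

    head_probs:  (H, N, C) softmax outputs of H heads on N unlabeled examples.
    thresholds:  (C,) per-class confidence thresholds (assumed fixed here;
                 MultiMatch adapts them during training, as in FreeMatch).
    avg_margins: (N,) precomputed average pseudo-margin per example
                 (stands in for MarginMatch's APM statistic).
    """
    # Criterion 1: head agreement -- all heads predict the same class.
    per_head_preds = head_probs.argmax(axis=2)                 # (H, N)
    agree = (per_head_preds == per_head_preds[0]).all(axis=0)  # (N,)

    # Ensemble prediction and confidence from the averaged head outputs.
    mean_probs = head_probs.mean(axis=0)                       # (N, C)
    pred = mean_probs.argmax(axis=1)                           # (N,)
    mean_conf = mean_probs.max(axis=1)                         # (N,)

    # Criterion 2: confidence exceeds the (per-class) threshold.
    confident = mean_conf >= thresholds[pred]

    # Criterion 3: average pseudo-margin marks the example as reliable.
    reliable = avg_margins >= margin_floor

    # Accepted examples get a difficulty-aware weight (here: mean confidence,
    # so harder/less certain examples contribute less); the rest get zero.
    mask = agree & confident & reliable
    weights = np.where(mask, mean_conf, 0.0)
    return pred, weights, mask

# Toy usage: 2 heads, 3 unlabeled examples, 2 classes.
head_probs = np.array([
    [[0.90, 0.10], [0.40, 0.60], [0.80, 0.20]],   # head 1
    [[0.80, 0.20], [0.70, 0.30], [0.75, 0.25]],   # head 2
])
thresholds = np.array([0.7, 0.7])
avg_margins = np.array([0.5, 0.2, -0.1])
pred, weights, mask = weight_pseudo_labels(head_probs, thresholds, avg_margins)
# Example 1 fails head agreement; example 2 fails the margin check.
```

In this toy run only the first example passes all three filters, so it is the only one contributing (with weight 0.85) to the unsupervised loss; the design point is that each criterion vetoes a different failure mode (head disagreement, low confidence, historically unstable predictions).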