Mixed Semi-Supervised Generalized-Linear-Regression with applications to Deep learning

📅 2023-02-19
🏛️ arXiv.org
📈 Citations: 1
Influential: 0

🤖 AI Summary
This paper addresses how to use unlabeled data effectively in semi-supervised learning (SSL) for regression tasks. We propose a mixed semi-supervised generalized linear regression framework in which a mixing parameter α controls the weight given to the unlabeled data. For generalized linear models and linear interpolators, we prove that integrating the unlabeled data with some mixing ratio α > 0 improves predictive performance, and we develop a framework for estimating the best α from the labeled and unlabeled data on hand. Extensive simulations show substantial improvements over standard supervised models in a variety of settings, and the methodology, with some heuristic modifications, also improves deep neural networks on real-world regression tasks. The core contributions are threefold: (i) a theoretical guarantee that unlabeled data improves regression performance for some positive mixing ratio; (ii) a mixed formulation with an estimable mixing ratio; and (iii) an extension of the approach to more complex models such as deep neural networks.
📝 Abstract
We present a methodology for using unlabeled data to design semi-supervised learning (SSL) methods that improve the predictive performance of supervised learning for regression tasks. The main idea is to design different mechanisms for integrating the unlabeled data, and include in each of them a mixing parameter $\alpha$, controlling the weight given to the unlabeled data. Focusing on Generalized Linear Models (GLM) and linear interpolators classes of models, we analyze the characteristics of different mixing mechanisms, and prove that it is consistently beneficial to integrate the unlabeled data with some nonzero mixing ratio $\alpha>0$, in terms of predictive performance. Moreover, we provide a rigorous framework to estimate the best mixing ratio where mixed-SSL delivers the best predictive performance, while using the labeled and unlabeled data on hand. The effectiveness of our methodology in delivering substantial improvement compared to the standard supervised models, in a variety of settings, is demonstrated empirically through extensive simulation, providing empirical support for our theoretical analysis. We also demonstrate the applicability of our methodology (with some heuristic modifications) to improve more complex models, such as deep neural networks, in real-world regression tasks.
Problem

Research questions and friction points this paper is trying to address.

Developing semi-supervised regression methods using unlabeled data
Optimizing mixing parameters to enhance supervised learning performance
Applying methodology to GLM, linear interpolators, and deep neural networks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixed semi-supervised learning with parameter α
Optimized mixing ratio for unlabeled data integration
Generalized Linear Models and deep learning applications
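
The idea of a mixing parameter can be illustrated with a minimal sketch for ordinary linear regression. The summary above does not spell out the paper's mixing mechanisms, so everything here is an assumption made for illustration: the hypothetical `mixed_ols` blends the second-moment matrix of the labeled features with the one computed from unlabeled features (assuming centered features), and the hypothetical `pick_alpha` selects α by plain K-fold cross-validation, which stands in for the paper's estimator of the best mixing ratio and is not that estimator.

```python
import numpy as np


def mixed_ols(X, y, Z, alpha):
    """Illustrative mixed estimator (an assumption, not the paper's definition).

    X, y  : labeled features (n x p) and responses (n,)
    Z     : unlabeled features (m x p)
    alpha : weight in [0, 1] given to the unlabeled second-moment matrix
    """
    n, m = X.shape[0], Z.shape[0]
    gram_labeled = X.T @ X / n      # second moments from labeled data
    gram_unlabeled = Z.T @ Z / m    # second moments from unlabeled data
    gram_mixed = (1.0 - alpha) * gram_labeled + alpha * gram_unlabeled
    # alpha = 0 reduces to the ordinary least-squares solution.
    return np.linalg.solve(gram_mixed, X.T @ y / n)


def pick_alpha(X, y, Z, alphas, n_folds=5, seed=0):
    """Choose alpha by K-fold cross-validation on the labeled data.

    This is a generic stand-in for an estimator of the best mixing ratio.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    folds = np.array_split(rng.permutation(n), n_folds)
    scores = []
    for alpha in alphas:
        fold_errors = []
        for held_out in folds:
            train = np.setdiff1d(np.arange(n), held_out)
            beta = mixed_ols(X[train], y[train], Z, alpha)
            residual = y[held_out] - X[held_out] @ beta
            fold_errors.append(np.mean(residual ** 2))
        scores.append(np.mean(fold_errors))
    return alphas[int(np.argmin(scores))]


# Synthetic example: few labeled points, many unlabeled points.
rng = np.random.default_rng(0)
p = 5
beta_true = rng.normal(size=p)
X = rng.normal(size=(30, p))
y = X @ beta_true + rng.normal(size=30)
Z = rng.normal(size=(2000, p))

alphas = np.linspace(0.0, 1.0, 11)
best_alpha = pick_alpha(X, y, Z, alphas)
print("selected alpha:", best_alpha)
```

Setting `alpha=0.0` recovers the standard supervised fit, so the grid search always includes the supervised baseline as a candidate. Whether a positive α helps in this toy setup depends on the data; the sketch only shows where the mixing parameter enters.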