Supervised Models Can Generalize Also When Trained on Random Labels

📅 2025-05-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work investigates whether supervised learning models can be trained, and can generalize, without access to any ground-truth labels $y$. To this end, we propose the *$y$-free smoother* paradigm, which formulates the model as a smoother whose matrix $S$ depends solely on the input features $x$, so that $S$ can be constructed without labels. We provide the first theoretical guarantee that supervised models can achieve effective training without ground-truth labels. Furthermore, we introduce a label-free model selection criterion based on the distribution of the out-of-sample predictions, circumventing the conventional reliance on labeled data for cross-validation. Empirical evaluation on synthetic and real-world datasets demonstrates that linear and kernel ridge regression, smoothing splines, and neural networks, trained exclusively on random (i.e., meaningless) labels, achieve performance comparable to standard supervised learning and substantially surpass random guessing. These results empirically support the core finding: ground-truth labels are not strictly necessary for effective supervised model training and generalization.

📝 Abstract

The success of unsupervised learning raises the question of whether supervised models can also be trained without using the information in the output $y$. In this paper, we demonstrate that this is indeed possible. The key step is to formulate the model as a smoother, i.e. of the form $\hat{f}=Sy$, and to construct the smoother matrix $S$ independently of $y$, e.g. by training on random labels. We present a simple model selection criterion based on the distribution of the out-of-sample predictions and show that, in contrast to cross-validation, this criterion can also be used without access to $y$. We demonstrate on real and synthetic data that $y$-free trained versions of linear and kernel ridge regression, smoothing splines, and neural networks perform similarly to their standard, $y$-based versions and, most importantly, significantly better than random guessing.
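To make the smoother form $\hat{f}=Sy$ concrete, the textbook closed forms for ridge and kernel ridge regression are written out below. The notation ($X$ for training inputs, $X^{*}$ for evaluation inputs, $\lambda$ for the regularization strength, $K$ for the kernel matrix) is assumed here for illustration and may differ from the paper's own parameterization.

```latex
\begin{align*}
% Ridge regression: y enters only through the final multiplication.
\hat{f}(X^{*}) &= X^{*}\left(X^{\top}X + \lambda I\right)^{-1}X^{\top}y = Sy,
  & S &= X^{*}\left(X^{\top}X + \lambda I\right)^{-1}X^{\top}, \\
% Kernel ridge regression: the same structure with kernel matrices.
\hat{f}(X^{*}) &= K(X^{*},X)\left(K(X,X) + \lambda I\right)^{-1}y = Sy,
  & S &= K(X^{*},X)\left(K(X,X) + \lambda I\right)^{-1}.
\end{align*}
```

In both cases $S$ is a function of the inputs and the hyperparameters only. In standard practice $y$ enters the construction of $S$ through hyperparameter selection (e.g. cross-validation), which is the dependence that a $y$-free selection criterion would remove.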

Problem

Research questions and friction points this paper is trying to address.

Can supervised models generalize when trained on random labels?

Can the smoother matrix be constructed independently of the output $y$?

How do $y$-free trained models perform compared with their standard, $y$-based versions?

Innovation

Methods, ideas, or system contributions that make the work stand out.

Train models as smoothers with random labels

Construct the smoother matrix $S$ independently of the output $y$

Use the distribution of out-of-sample predictions for model selection, without access to $y$
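The idea of a smoother matrix built without labels can be sketched for kernel ridge regression with NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names (`gaussian_kernel`, `krr_smoother`), the synthetic sine data, and the hand-fixed hyperparameters (`bandwidth=1.0`, `ridge=1e-2`) are all hypothetical choices made here. The paper's label-free model selection criterion and its random-label training of neural networks are not reproduced.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth):
    """Gaussian (RBF) kernel matrix between two 1-D input arrays."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))


def krr_smoother(x_train, x_test, bandwidth, ridge):
    """Return S such that f_hat(x_test) = S @ y_train. No labels are used."""
    k_train = gaussian_kernel(x_train, x_train, bandwidth)
    k_test = gaussian_kernel(x_test, x_train, bandwidth)
    a = k_train + ridge * np.eye(len(x_train))
    # S = K_test A^{-1}; A is symmetric, so solve A Z = K_test^T and transpose.
    return np.linalg.solve(a, k_test.T).T


# Synthetic data (assumption): noisy sine curve on [-3, 3].
rng = np.random.default_rng(0)
x_train = rng.uniform(-3.0, 3.0, size=100)
x_test = rng.uniform(-3.0, 3.0, size=50)
y_train = np.sin(x_train) + 0.1 * rng.standard_normal(100)
y_test = np.sin(x_test) + 0.1 * rng.standard_normal(50)

# Hyperparameters are fixed by hand here (assumption), so S never sees y.
bandwidth, ridge = 1.0, 1e-2
S = krr_smoother(x_train, x_test, bandwidth, ridge)

# Labels enter only at prediction time, through a single matrix product.
f_hat = S @ y_train

# Reference: standard kernel ridge regression with the same hyperparameters.
alpha = np.linalg.solve(
    gaussian_kernel(x_train, x_train, bandwidth) + ridge * np.eye(100), y_train
)
f_ref = gaussian_kernel(x_test, x_train, bandwidth) @ alpha

# Random-guessing baseline: predict with randomly drawn training labels.
f_random = rng.choice(y_train, size=len(x_test))

mse_smoother = np.mean((f_hat - y_test) ** 2)
mse_random = np.mean((f_random - y_test) ** 2)
print("max |f_hat - f_ref|:", np.max(np.abs(f_hat - f_ref)))
print("MSE smoother:", mse_smoother, "MSE random guessing:", mse_random)
```

With fixed hyperparameters the two prediction routes agree up to floating-point error, which shows that $y$ is needed only in the final product $Sy$. Choosing `bandwidth` and `ridge` without labels is the part this sketch leaves open.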
