🤖 AI Summary
This work investigates the learnability of high-dimensional single- and multi-index models under random perturbations of the data distribution. Conventional isotropic assumptions can force prohibitively high sample complexity (e.g., Ω(d²)) for targets with large information or generative exponents. To address this, the authors introduce a random mean-shift framework and, using tools from high-dimensional probability, prove that an arbitrarily small random shift of the first moment renders any Gaussian single-index model learnable with O(d) samples—matching the sample complexity of linear models. This degeneracy effect reveals a universal simplification of otherwise intractable learning tasks under generic perturbations, challenging established notions of learning hardness. The result further extends to a class of multi-index models, sparse Boolean functions (k-Juntas), which likewise become efficiently learnable.
📝 Abstract
The problem of learning single-index and multi-index models has gained significant interest as a fundamental task in high-dimensional statistics. Many recent works have analysed gradient-based methods, particularly in the setting of isotropic data distributions, often in the context of neural network training. Such studies have uncovered precise characterisations of algorithmic sample complexity in terms of certain analytic properties of the target function, such as the leap, information, and generative exponents. These properties establish a quantitative separation between low- and high-complexity learning tasks. In this work, we show that high-complexity cases are rare. Specifically, we prove that introducing a small random perturbation to the data distribution (via a random shift in the first moment) renders any Gaussian single-index model as easy to learn as a linear function. We further extend this result to a class of multi-index models, namely sparse Boolean functions, also known as Juntas.
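The mean-shift setup described above can be sketched numerically. The toy example below (our own illustration, not the paper's algorithm; all parameter choices are hypothetical) draws data from a Gaussian single-index model with a degree-2 Hermite link, whose first moment has been randomly shifted, and shows that a plain first-moment average already correlates with the hidden direction — a correlation that vanishes identically in the isotropic (zero-mean) case.

```python
import numpy as np

# Illustrative sketch (not the paper's algorithm): a Gaussian single-index
# model x ~ N(mu, I_d), y = He2(<w, x>), where mu is a small random mean
# shift. Parameters d, n, delta and the He2 link are hypothetical choices.
rng = np.random.default_rng(0)

d, n, delta = 50, 100_000, 3.0          # dimension, samples, shift magnitude

w = rng.standard_normal(d)
w /= np.linalg.norm(w)                  # hidden index direction, ||w|| = 1

u = rng.standard_normal(d)
mu = delta * u / np.linalg.norm(u)      # random mean shift: x ~ N(mu, I_d)

x = mu + rng.standard_normal((n, d))
g = x @ w
y = g**2 - 1                            # He2 link: information exponent 2 when mu = 0

# When mu = 0, E[y (x - mu)] vanishes, so no linear statistic carries signal.
# With the shift, E[y (x - mu)] = 2(w . mu) w, so a plain empirical average
# (centred by the known mu, for simplicity) points toward the hidden direction.
v = (y[:, None] * (x - mu)).mean(axis=0)
cos = abs(v @ w) / np.linalg.norm(v)
print(f"|cos(v, w)| = {cos:.3f}")       # typically close to 1 in this regime
```

This naive estimator only illustrates how the random shift breaks the symmetry that makes the isotropic problem hard; the paper's O(d) guarantee rests on a sharper analysis.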