🤖 AI Summary
This work investigates the learnability of high-dimensional single- and multi-index models under random perturbations of the data distribution. Conventional isotropic assumptions can force prohibitively high sample complexity (e.g., Ω(d²)) for targets with large information or generative exponents. To address this, the authors introduce a random mean-shift framework and, using tools from high-dimensional probability, prove that an arbitrarily small random shift of the first moment renders any Gaussian single-index model learnable with O(d) samples—matching the sample complexity of linear models. This degeneracy effect reveals a universal simplification of otherwise intractable learning tasks under generic perturbations, challenging established notions of learning hardness. The result further extends to a class of multi-index models, sparse Boolean functions (k-Juntas), which likewise become efficiently learnable.
📝 Abstract
The problem of learning single-index and multi-index models has gained significant interest as a fundamental task in high-dimensional statistics. Many recent works have analysed gradient-based methods, particularly in the setting of isotropic data distributions, often in the context of neural network training. Such studies have uncovered precise characterisations of algorithmic sample complexity in terms of certain analytic properties of the target function, such as the leap, information, and generative exponents. These properties establish a quantitative separation between low- and high-complexity learning tasks. In this work, we show that high-complexity cases are rare. Specifically, we prove that introducing a small random perturbation to the data distribution (via a random shift in the first moment) renders any Gaussian single-index model as easy to learn as a linear function. We further extend this result to a class of multi-index models, namely sparse Boolean functions, also known as Juntas.
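The mean-shift setup described above can be sketched numerically. The toy example below (our own illustration, not the paper's algorithm; all parameter choices are hypothetical) draws data from a Gaussian single-index model with a degree-2 Hermite link, whose first moment has been randomly shifted, and shows that a plain first-moment average already correlates with the hidden direction — a correlation that vanishes identically in the isotropic (zero-mean) case.

```python
import numpy as np

# Illustrative sketch (not the paper's algorithm): a Gaussian single-index
# model x ~ N(mu, I_d), y = He2(<w, x>), where mu is a small random mean
# shift. Parameters d, n, delta and the He2 link are hypothetical choices.
rng = np.random.default_rng(0)

d, n, delta = 50, 100_000, 3.0          # dimension, samples, shift magnitude

w = rng.standard_normal(d)
w /= np.linalg.norm(w)                  # hidden index direction, ||w|| = 1

u = rng.standard_normal(d)
mu = delta * u / np.linalg.norm(u)      # random mean shift: x ~ N(mu, I_d)

x = mu + rng.standard_normal((n, d))
g = x @ w
y = g**2 - 1                            # He2 link: information exponent 2 when mu = 0

# When mu = 0, E[y (x - mu)] vanishes, so no linear statistic carries signal.
# With the shift, E[y (x - mu)] = 2(w . mu) w, so a plain empirical average
# (centred by the known mu, for simplicity) points toward the hidden direction.
v = (y[:, None] * (x - mu)).mean(axis=0)
cos = abs(v @ w) / np.linalg.norm(v)
print(f"|cos(v, w)| = {cos:.3f}")       # typically close to 1 in this regime
```

This naive estimator only illustrates how the random shift breaks the symmetry that makes the isotropic problem hard; the paper's O(d) guarantee rests on a sharper analysis.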