The Generative Leap: Sharp Sample Complexity for Efficiently Learning Gaussian Multi-Index Models

📅 2025-06-05
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work studies agnostic learning of the hidden low-dimensional subspace in Gaussian multi-index models: labels depend on a $d$-dimensional Gaussian input only through its projection onto an unknown $r = O(1)$-dimensional subspace. The authors introduce the *generative leap* exponent $k^\star$, a generalization of the generative exponent to the multi-index setting, and establish tight $\Theta(d^{1 \vee k^\star/2})$ sample complexity bounds: the lower bound holds for algorithms captured by the low-degree-polynomial framework, and a matching upper bound is achieved by a sequential spectral estimation procedure, based on spectral $U$-statistics over Hermite tensors, that requires no prior knowledge of the model. They also compute $k^\star$ explicitly for canonical examples, including piecewise linear functions (deep ReLU networks with bias) and general deep networks with an $r$-dimensional first hidden layer.

📝 Abstract
In this work we consider generic Gaussian multi-index models, in which the labels only depend on the (Gaussian) $d$-dimensional inputs through their projection onto a low-dimensional $r = O_d(1)$ subspace, and we study efficient agnostic estimation procedures for this hidden subspace. We introduce the *generative leap* exponent $k^\star$, a natural extension of the generative exponent from [Damian et al.'24] to the multi-index setting. We first show that a sample complexity of $n = \Theta(d^{1 \vee k^\star/2})$ is necessary in the class of algorithms captured by the Low-Degree-Polynomial framework. We then establish that this sample complexity is also sufficient, by giving an agnostic sequential estimation procedure (that is, requiring no prior knowledge of the multi-index model) based on a spectral $U$-statistic over appropriate Hermite tensors. We further compute the generative leap exponent for several examples including piecewise linear functions (deep ReLU networks with bias), and general deep neural networks (with $r$-dimensional first hidden layer).
Problem

Research questions and friction points this paper is trying to address.

Efficiently learning Gaussian Multi-index models
Agnostic estimation of hidden low-dimensional subspace
Determining sample complexity for multi-index models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spectral U-statistic for subspace estimation
Generative leap exponent for complexity
Agnostic sequential estimation procedure
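To make the spectral idea concrete, here is a minimal NumPy sketch, not the paper's algorithm, illustrating Hermite-tensor-based spectral subspace estimation in the simplest case: for a synthetic model whose relevant statistic appears at Hermite degree 2, the hidden subspace can be recovered from the top eigenvectors of the empirical matrix $M = \frac{1}{n}\sum_i y_i (x_i x_i^\top - I)$, built on the degree-2 Hermite tensor. All variable names and the quadratic link function are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, n = 50, 2, 20_000

# Ground-truth r-dimensional subspace (orthonormal basis U) and Gaussian inputs.
U, _ = np.linalg.qr(rng.standard_normal((d, r)))
X = rng.standard_normal((n, d))

# Labels depend on x only through its projection z = U^T x (here: ||z||^2 + noise,
# a hypothetical link chosen so the degree-2 Hermite statistic carries signal).
Z = X @ U
y = np.sum(Z**2, axis=1) + 0.1 * rng.standard_normal(n)

# Spectral statistic over the degree-2 Hermite tensor He_2(x) = x x^T - I:
#   M = (1/n) sum_i y_i (x_i x_i^T - I)
# In population, M = 2 U U^T for this link, so its top-r eigenspace spans the subspace.
M = (X * y[:, None]).T @ X / n - y.mean() * np.eye(d)

# Estimate the subspace from the r eigenvectors with largest |eigenvalue|.
w, V = np.linalg.eigh(M)
U_hat = V[:, np.argsort(-np.abs(w))[:r]]

# Alignment with the truth: smallest singular value of U_hat^T U (1.0 = perfect).
align = np.linalg.svd(U_hat.T @ U, compute_uv=False).min()
print(f"subspace alignment: {align:.3f}")
```

The paper's actual procedure is sequential and agnostic: it does not assume the informative Hermite degree is known in advance, but probes degrees in order and aggregates directions via a $U$-statistic, whereas this sketch fixes degree 2 for a model where that degree suffices.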