The Generative Leap: Sharp Sample Complexity for Efficiently Learning Gaussian Multi-Index Models

📅 2025-06-05
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work studies agnostic learning of the hidden low-dimensional subspace in Gaussian multi-index models: labels depend on a $d$-dimensional Gaussian input only through its projection onto an unknown $r = O(1)$-dimensional subspace. The authors introduce the *generative leap* exponent $k^\star$, a generalization of the generative exponent to the multi-index setting, and establish tight $\Theta(d^{1 \vee k^\star/2})$ sample complexity bounds: the lower bound holds for algorithms captured by the low-degree-polynomial framework, and a matching upper bound is achieved by a sequential spectral estimation procedure, based on spectral $U$-statistics over Hermite tensors, that requires no prior knowledge of the model. They also compute $k^\star$ explicitly for canonical examples, including piecewise linear functions (deep ReLU networks with bias) and general deep networks with an $r$-dimensional first hidden layer.

📝 Abstract
In this work we consider generic Gaussian multi-index models, in which the labels only depend on the (Gaussian) $d$-dimensional inputs through their projection onto a low-dimensional $r = O_d(1)$ subspace, and we study efficient agnostic estimation procedures for this hidden subspace. We introduce the *generative leap* exponent $k^\star$, a natural extension of the generative exponent from [Damian et al.'24] to the multi-index setting. We first show that a sample complexity of $n = \Theta(d^{1 \vee k^\star/2})$ is necessary in the class of algorithms captured by the Low-Degree-Polynomial framework. We then establish that this sample complexity is also sufficient, by giving an agnostic sequential estimation procedure (that is, requiring no prior knowledge of the multi-index model) based on a spectral $U$-statistic over appropriate Hermite tensors. We further compute the generative leap exponent for several examples including piecewise linear functions (deep ReLU networks with bias), and general deep neural networks (with $r$-dimensional first hidden layer).
Problem

Research questions and friction points this paper is trying to address.

Efficiently learning Gaussian Multi-index models
Agnostic estimation of hidden low-dimensional subspace
Determining sample complexity for multi-index models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spectral U-statistic for subspace estimation
Generative leap exponent for complexity
Agnostic sequential estimation procedure
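To make the spectral idea concrete, here is a minimal NumPy sketch, not the paper's algorithm, illustrating Hermite-tensor-based spectral subspace estimation in the simplest case: for a synthetic model whose relevant statistic appears at Hermite degree 2, the hidden subspace can be recovered from the top eigenvectors of the empirical matrix $M = \frac{1}{n}\sum_i y_i (x_i x_i^\top - I)$, built on the degree-2 Hermite tensor. All variable names and the quadratic link function are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, n = 50, 2, 20_000

# Ground-truth r-dimensional subspace (orthonormal basis U) and Gaussian inputs.
U, _ = np.linalg.qr(rng.standard_normal((d, r)))
X = rng.standard_normal((n, d))

# Labels depend on x only through its projection z = U^T x (here: ||z||^2 + noise,
# a hypothetical link chosen so the degree-2 Hermite statistic carries signal).
Z = X @ U
y = np.sum(Z**2, axis=1) + 0.1 * rng.standard_normal(n)

# Spectral statistic over the degree-2 Hermite tensor He_2(x) = x x^T - I:
#   M = (1/n) sum_i y_i (x_i x_i^T - I)
# In population, M = 2 U U^T for this link, so its top-r eigenspace spans the subspace.
M = (X * y[:, None]).T @ X / n - y.mean() * np.eye(d)

# Estimate the subspace from the r eigenvectors with largest |eigenvalue|.
w, V = np.linalg.eigh(M)
U_hat = V[:, np.argsort(-np.abs(w))[:r]]

# Alignment with the truth: smallest singular value of U_hat^T U (1.0 = perfect).
align = np.linalg.svd(U_hat.T @ U, compute_uv=False).min()
print(f"subspace alignment: {align:.3f}")
```

The paper's actual procedure is sequential and agnostic: it does not assume the informative Hermite degree is known in advance, but probes degrees in order and aggregates directions via a $U$-statistic, whereas this sketch fixes degree 2 for a model where that degree suffices.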