Algorithms and SQ Lower Bounds for Robustly Learning Real-valued Multi-index Models

📅 2025-05-27
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This paper studies robust PAC learning of real-valued multi-index models (MIMs) under adversarial label noise when inputs follow a Gaussian distribution. For function classes that depend only on projections onto unknown $K$-dimensional subspaces, we establish a nearly matching statistical query (SQ) lower bound for MIM learning. We also propose the first efficient algorithm for learning positive-homogeneous Lipschitz MIMs, eliminating the exponential dependence on network size incurred by prior work. Our approach integrates distinguishing-moment analysis, subspace learning, polynomial approximation, and noise-robust estimation, all within the SQ framework and with provable theoretical guarantees. The algorithm's complexity is $d^{O(m)} 2^{\mathrm{poly}(K/\varepsilon)}$, where $m$ bounds the degree of the distinguishing moments; for Lipschitz positive-homogeneous ReLU networks, it improves to $\mathrm{poly}(d) \cdot 2^{\mathrm{poly}(KL/\varepsilon)}$, significantly outperforming prior exponential-time methods.

πŸ“ Abstract
We study the complexity of learning real-valued Multi-Index Models (MIMs) under the Gaussian distribution. A $K$-MIM is a function $f:\mathbb{R}^d \to \mathbb{R}$ that depends only on the projection of its input onto a $K$-dimensional subspace. We give a general algorithm for PAC learning a broad class of MIMs with respect to the square loss, even in the presence of adversarial label noise. Moreover, we establish a nearly matching Statistical Query (SQ) lower bound, providing evidence that the complexity of our algorithm is qualitatively optimal as a function of the dimension. Specifically, we consider the class of bounded variation MIMs with the property that degree at most $m$ distinguishing moments exist with respect to projections onto any subspace. In the presence of adversarial label noise, the complexity of our learning algorithm is $d^{O(m)} 2^{\mathrm{poly}(K/\epsilon)}$. For the realizable and independent noise settings, our algorithm incurs complexity $d^{O(m)} 2^{\mathrm{poly}(K)} (1/\epsilon)^{O(K)}$. To complement our upper bound, we show that if for some subspace degree-$m$ distinguishing moments do not exist, then any SQ learner for the corresponding class of MIMs requires complexity $d^{\Omega(m)}$. As an application, we give the first efficient learner for the class of positive-homogeneous $L$-Lipschitz $K$-MIMs. The resulting algorithm has complexity $\mathrm{poly}(d)\, 2^{\mathrm{poly}(KL/\epsilon)}$. This gives a new PAC learning algorithm for Lipschitz homogeneous ReLU networks with complexity independent of the network size, removing the exponential dependence incurred in prior work.
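As an illustrative sketch (not the paper's algorithm), the idea of a low-degree distinguishing moment can be seen in a simple degree-2 example: under Gaussian inputs, the Stein-type moment matrix $\mathbb{E}[y\,(xx^\top - I)]$ reveals the hidden subspace of a positive-homogeneous MIM such as a sum of ReLUs. All variable names and the specific target function below are assumptions chosen for illustration.

```python
import numpy as np

# Illustrative sketch only: a degree-2 "distinguishing moment" exposing
# the hidden K-dimensional subspace of a positive-homogeneous MIM.
rng = np.random.default_rng(0)
d, K, n = 20, 2, 200_000

# Hidden subspace W (d x K, orthonormal columns) and labels y = f(x).
W = np.linalg.qr(rng.standard_normal((d, K)))[0]
x = rng.standard_normal((n, d))
y = np.maximum(x @ W, 0.0).sum(axis=1)  # f(x) = sum_k ReLU(w_k . x)

# Empirical moment matrix M ~ E[y (x x^T - I)]; for this f one can check
# M = c * W W^T with c = 1/sqrt(2*pi), so M has rank K.
M = (x.T * y) @ x / n - y.mean() * np.eye(d)

# Top-K eigenvectors (by |eigenvalue|) estimate the relevant subspace.
evals, evecs = np.linalg.eigh(M)
V = evecs[:, np.argsort(-np.abs(evals))[:K]]

# Cosine of the largest principal angle between span(W) and span(V).
alignment = float(np.linalg.svd(W.T @ V, compute_uv=False).min())
print(f"subspace alignment: {alignment:.3f}")
```

Once a good estimate of the subspace is in hand, the learner can restrict to $K$ dimensions and fit the link function there, which is where the $2^{\mathrm{poly}(K/\epsilon)}$ factor (rather than any dependence on $d$) arises.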
Problem

Research questions and friction points this paper is trying to address.

Learning real-valued Multi-Index Models under Gaussian distribution
Robust PAC learning with adversarial label noise
Establishing SQ lower bounds for learning complexity
Innovation

Methods, ideas, or system contributions that make the work stand out.

PAC learning algorithm for MIMs
Handles adversarial label noise
Complexity independent of network size