Robust Learning of Multi-index Models via Iterative Subspace Approximation

📅 2025-02-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper studies robust learning of multi-index models (MIMs) under label noise with Gaussian covariates, aiming to efficiently recover the low-dimensional subspace that determines the function's output. It proposes a general robust MIM learner that is qualitatively optimal in the statistical query (SQ) model. The algorithm iteratively approximates the defining subspace: it estimates low-degree moments conditional on the projection onto the subspace found so far, then adds directions whose empirical moments are relatively large. The main results are: (1) an efficient procedure that finds a subspace V such that f(x) is close to a function of the projection of x onto V, together with an SQ lower bound for functions where these conditional moments do not help; (2) for multiclass linear classifiers, a constant-factor approximate agnostic learner with sample complexity N = O(d) 2^{poly(K/ε)} and runtime poly(N, d); (3) for intersections of halfspaces, an agnostic learner with 0-1 error K·Õ(OPT) + ε, sample complexity N = O(d²) 2^{poly(K/ε)}, and runtime poly(N, d). Under random classification noise, the complexity of the algorithm scales polynomially with 1/ε. These are the first agnostic learners for the two classes whose complexity is a fixed-degree polynomial in d, with constant-factor error for multiclass linear classifiers and near-linear error dependence for intersections of halfspaces.

📝 Abstract
We study the task of learning Multi-Index Models (MIMs) with label noise under the Gaussian distribution. A $K$-MIM is any function $f$ that only depends on a $K$-dimensional subspace. We focus on well-behaved MIMs with finite ranges that satisfy certain regularity properties. Our main contribution is a general robust learner that is qualitatively optimal in the Statistical Query (SQ) model. Our algorithm iteratively constructs better approximations to the defining subspace by computing low-degree moments conditional on the projection to the subspace computed thus far, and adding directions with relatively large empirical moments. This procedure efficiently finds a subspace $V$ so that $f(\mathbf{x})$ is close to a function of the projection of $\mathbf{x}$ onto $V$. Conversely, for functions for which these conditional moments do not help, we prove an SQ lower bound suggesting that no efficient learner exists. As applications, we provide faster robust learners for the following concept classes:

* Multiclass Linear Classifiers: We give a constant-factor approximate agnostic learner with sample complexity $N = O(d) 2^{\mathrm{poly}(K/\epsilon)}$ and computational complexity $\mathrm{poly}(N, d)$. This is the first constant-factor agnostic learner for this class whose complexity is a fixed-degree polynomial in $d$.
* Intersections of Halfspaces: We give an approximate agnostic learner for this class achieving 0-1 error $K \tilde{O}(\mathrm{OPT}) + \epsilon$ with sample complexity $N = O(d^2) 2^{\mathrm{poly}(K/\epsilon)}$ and computational complexity $\mathrm{poly}(N, d)$. This is the first agnostic learner for this class with near-linear error dependence and complexity a fixed-degree polynomial in $d$.

Furthermore, we show that in the presence of random classification noise, the complexity of our algorithm scales polynomially with $1/\epsilon$.
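The iterative procedure described in the abstract can be illustrated with a small NumPy sketch. This is a simplified toy under stated assumptions, not the paper's algorithm: it uses only degree-1 conditional moments, a fixed cubic discretization of the projection, and a hand-picked eigenvalue threshold. The function names (`conditional_moment_directions`, `iterative_subspace`), the parameters `cell_width` and `threshold`, and the 4-class target are all hypothetical choices made for illustration.

```python
import numpy as np

def conditional_moment_directions(X, y, V, cell_width=1.0, threshold=0.03):
    """Find directions orthogonal to V with large conditional first moments.

    Assumption: degree-1 moments only; the paper speaks of low-degree moments.
    """
    n, d = X.shape
    proj = X @ V                    # coordinates inside the current subspace
    X_perp = X - proj @ V.T         # component orthogonal to the current subspace
    if V.shape[1] == 0:
        cell_ids = np.zeros(n, dtype=int)   # no conditioning in the first round
    else:
        # Discretize the projection into cubes of side cell_width.
        cells = np.floor(proj / cell_width).astype(int)
        _, cell_ids = np.unique(cells, axis=0, return_inverse=True)
        cell_ids = cell_ids.ravel()
    labels = np.unique(y)
    M = np.zeros((d, d))
    for c in np.unique(cell_ids):
        in_cell = cell_ids == c
        weight = in_cell.mean()             # empirical probability of the cell
        for label in labels:
            # Empirical E[x_perp * 1(y = label) | cell]
            m = (X_perp[in_cell] * (y[in_cell] == label)[:, None]).mean(axis=0)
            M += weight * np.outer(m, m)
    eigvals, eigvecs = np.linalg.eigh(M)
    return eigvecs[:, eigvals > threshold]  # keep only directions with large moments

def iterative_subspace(X, y, max_rounds=5, **kwargs):
    """Grow an orthonormal basis V until no direction has a large moment."""
    V = np.zeros((X.shape[1], 0))
    for _ in range(max_rounds):
        new_dirs = conditional_moment_directions(X, y, V, **kwargs)
        if new_dirs.shape[1] == 0:
            break
        V, _ = np.linalg.qr(np.hstack([V, new_dirs]))  # re-orthonormalize
    return V

# Hypothetical toy target depending only on (x1, x2): a 2-MIM with 4 labels.
rng = np.random.default_rng(0)
n, d = 20000, 8
X = rng.standard_normal((n, d))
y = np.where(X[:, 0] > 1, 2,
             np.where(X[:, 0] < -1, 3, (X[:, 0] * X[:, 1] > 0).astype(int)))
flip = rng.random(n) < 0.05                 # 5% of labels replaced at random
y[flip] = rng.integers(0, 4, size=flip.sum())

V = iterative_subspace(X, y)
print("recovered dimension:", V.shape[1])
```

In this toy target the unconditional first moments only reveal the first coordinate direction; the second direction shows up in the moments only after conditioning on the projection onto the first, which is the behavior the iteration is meant to exploit.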
Problem

Research questions and friction points this paper is trying to address.

Learning Multi-Index Models with noise
Robust learner in Statistical Query model
Efficiently finding a subspace whose projection approximately determines the function
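One way to write down the objects named in these points, as a sketch in assumed notation: the symbols $g$, $W$, $h$, and $\mathcal{C}$ are introduced here for illustration and are not taken from the paper.

$$
f(\mathbf{x}) = g(W\mathbf{x}), \qquad W \in \mathbb{R}^{K \times d}, \qquad \mathbf{x} \sim \mathcal{N}(0, I_d)
$$

$$
\mathrm{OPT} = \min_{c \in \mathcal{C}} \Pr_{(\mathbf{x}, y)}\left[c(\mathbf{x}) \neq y\right]
$$

Here $\mathcal{C}$ stands for the concept class (for example, intersections of halfspaces). Under this reading, the learner looks for a subspace $V$ and a function $h$ such that $\Pr[h(\mathrm{proj}_V \mathbf{x}) \neq y]$ is small relative to $\mathrm{OPT}$, for instance $K \tilde{O}(\mathrm{OPT}) + \epsilon$ in the intersections-of-halfspaces result.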
Innovation

Methods, ideas, or system contributions that make the work stand out.

Iterative subspace approximation for MIMs
Low-degree moments for subspace construction
Complexity polynomial in 1/ε under random classification noise