Polynomial-Time Robust Multiclass Linear Classification under Gaussian Marginals

📅 2026-05-20

📈 Citations: 0

✨ Influential: 0


🤖 AI Summary

This work addresses agnostic learning of multiclass linear classifiers under Gaussian marginals and presents fully polynomial-time robust learners with dimension-independent error guarantees for three or more classes, a regime where prior robust algorithms depended exponentially on the inverse of the target accuracy. It first shows that the standard multiclass perceptron needs super-polynomially many samples and updates even with clean labels and Gaussian marginals, an obstruction absent in the binary case. The main positive result is a pairwise improper-learning framework that achieves error $\widetilde{O}(k^{3/2}\sqrt{\mathrm{opt}}) + \varepsilon$ for general $k$. A sharper localization-based framework achieves $O(\mathrm{opt}) + \varepsilon$ for $k = 3$ and $\mathrm{poly}(k) \cdot \mathrm{opt} + \varepsilon$ for geometrically regular $k$-class linear classifiers.
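
For reference, the quantities in these bounds can be written out as follows. This is a sketch of the standard agnostic-learning definitions, assuming homogeneous classifiers (no bias terms) and an arbitrary fixed tie-breaking rule; the symbols $W$, $h_W$, $D$, and $\mathrm{err}$ are notation introduced here for illustration and may differ from the paper's.

$$
h_W(x) = \arg\max_{i \in [k]} \langle w_i, x \rangle, \qquad
\mathrm{err}(h) = \Pr_{(x,y) \sim D}\left[ h(x) \neq y \right], \qquad
\mathrm{opt} = \inf_{W \in \mathbb{R}^{k \times d}} \mathrm{err}(h_W).
$$

Under this reading, a guarantee such as $O(\mathrm{opt}) + \varepsilon$ means the returned hypothesis $\hat{h}$ satisfies $\mathrm{err}(\hat{h}) \le C \cdot \mathrm{opt} + \varepsilon$ for an absolute constant $C$, with no dependence on the dimension $d$.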

📝 Abstract

We study the task of agnostic learning of multiclass linear classifiers under the Gaussian distribution. Given labeled examples $(x, y)$ from a distribution over $\mathbb{R}^d \times [k]$, with Gaussian $x$-marginal, the goal is to output a hypothesis whose error is comparable to that of the best $k$-class linear classifier. While the binary case $k=2$ has a well-developed algorithmic theory, much less is known for $k \ge 3$. Even for $k=3$, prior robust algorithms incur exponential dependence on the inverse of the desired accuracy in both complexity and representation size. In this work, we develop new structural results for multiclass linear classifiers and use them to design fully polynomial-time robust learners with dimension-independent error guarantees. Our first result shows that the standard multiclass perceptron algorithm requires super-polynomially many samples and updates, even with clean labels and Gaussian marginals, revealing a basic obstruction absent in the binary case. Our main positive result is a pairwise improper-learning framework which yields an efficient learner with error $\widetilde{O}(k^{3/2}\sqrt{\mathrm{opt}})+\varepsilon$ for general $k$. Additionally, we develop a sharper localization-based framework which leads to error $O(\mathrm{opt})+\varepsilon$ for $k=3$, and error $\mathrm{poly}(k)\cdot\mathrm{opt}+\varepsilon$ for geometrically regular $k$-class linear classifiers.
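
To make the setting concrete, here is a minimal Python sketch of a $k$-class linear classifier on Gaussian inputs, its 0-1 error, and the textbook multiclass perceptron update that the first result refers to. It is not the paper's algorithm. The function names (`predict`, `zero_one_error`, `multiclass_perceptron`), the absence of bias terms, the problem sizes, and the random label-shift noise are all assumptions made for illustration; the agnostic model allows arbitrary label corruption, not just this kind.

```python
import numpy as np


def predict(W, X):
    """k-class linear classifier: label = argmax_i <w_i, x>.

    W has shape (k, d), X has shape (n, d). Ties go to the smallest index
    (an arbitrary convention; ties have probability zero for Gaussian x).
    """
    return np.argmax(X @ W.T, axis=1)


def zero_one_error(W, X, y):
    """Empirical 0-1 error of the classifier W on the sample (X, y)."""
    return float(np.mean(predict(W, X) != y))


def multiclass_perceptron(X, y, k, max_passes=20):
    """Textbook multiclass perceptron (illustrative, not the paper's learner).

    On each mistake, add x to the true class's weight vector and subtract it
    from the predicted class's weight vector. Returns the weights and the
    total number of updates performed.
    """
    n, d = X.shape
    W = np.zeros((k, d))
    updates = 0
    for _ in range(max_passes):
        mistakes = 0
        for x, label in zip(X, y):
            pred = int(np.argmax(W @ x))
            if pred != label:
                W[label] += x
                W[pred] -= x
                mistakes += 1
        updates += mistakes
        if mistakes == 0:  # a full pass with no mistakes: stop early
            break
    return W, updates


# Hypothetical problem sizes, chosen only to keep the demo fast.
rng = np.random.default_rng(0)
d, k, n = 5, 3, 500

W_star = rng.standard_normal((k, d))   # an arbitrary "best" classifier
X = rng.standard_normal((n, d))        # standard Gaussian x-marginal
y_clean = predict(W_star, X)           # clean labels from W_star

# Illustrative corruption: shift the label on a random ~5% of the examples.
eta = 0.05
flip = rng.random(n) < eta
y_noisy = np.where(flip, (y_clean + 1) % k, y_clean)

W_hat, num_updates = multiclass_perceptron(X, y_clean, k)
print("error of W_star on noisy labels:", zero_one_error(W_star, X, y_noisy))
print("perceptron updates on clean labels:", num_updates)
print("perceptron training error:", zero_one_error(W_hat, X, y_clean))
```

In this toy setup the error of `W_star` on the noisy labels equals the fraction of flipped labels, which upper-bounds the empirical analogue of $\mathrm{opt}$. The perceptron run is shown only to illustrate the update rule; a small random instance like this one says nothing about the super-polynomial lower bound stated in the abstract.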

Problem

Research questions and friction points this paper is trying to address.

multiclass classification

agnostic learning

Gaussian marginals

robust learning

polynomial-time algorithms

Innovation

Methods, ideas, or system contributions that make the work stand out.

multiclass linear classification

agnostic learning

Gaussian marginals

polynomial-time algorithm

improper learning

🔎 Similar Papers

Robust Twin Parametric Margin Support Vector Machine for Multiclass Classification

2023-06-09 · arXiv.org · Citations: 4

Building a stable classifier with the inflated argmax

2024-05-22 · Neural Information Processing Systems · Citations: 2
