A Near-optimal Algorithm for Learning Margin Halfspaces with Massart Noise

📅 2025-01-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper studies PAC learning of $\gamma$-margin halfspaces under Massart noise, where $\eta < 1/2$ is an upper bound on the noise rate. We propose a computationally efficient algorithm based on online stochastic gradient descent (SGD) over a carefully selected sequence of convex surrogate losses. The method attains 0-1 error $\eta + \epsilon$ using $\widetilde{\Theta}(1/(\gamma^2 \epsilon^2))$ labeled examples, improving on the $\tilde{O}(1/(\gamma^4 \epsilon^3))$ sample complexity of prior efficient algorithms. Without computational constraints, the sample complexity of the problem is $\widetilde{\Theta}(1/(\gamma^2 \epsilon))$. Recent work, however, gave evidence of an information-computation tradeoff, suggesting that computationally efficient algorithms need a quadratic dependence on $1/\epsilon$. The new bound nearly matches that lower bound, so it is near-optimal among efficient learners rather than information-theoretically optimal. The algorithm is also simple and practical to implement.
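To get a feel for the gaps between the rates discussed for this problem, namely $1/(\gamma^2 \epsilon)$ without computational constraints, $1/(\gamma^2 \epsilon^2)$ for the new learner, and $1/(\gamma^4 \epsilon^3)$ for prior efficient algorithms, the sketch below evaluates the leading terms for one setting. The function name `sample_bounds` and the values of `gamma` and `eps` are illustrative assumptions. Constants and logarithmic factors are ignored, so only the ratios between the rates are meaningful.

```python
def sample_bounds(gamma, eps):
    """Leading-order terms of the three rates (constants and log factors dropped)."""
    return {
        "information_theoretic": 1.0 / (gamma ** 2 * eps),
        "this_paper": 1.0 / (gamma ** 2 * eps ** 2),
        "prior_efficient": 1.0 / (gamma ** 4 * eps ** 3),
    }


# Illustrative values only; they are not taken from the paper.
bounds = sample_bounds(gamma=0.1, eps=0.01)
for name, value in bounds.items():
    print(f"{name:>22}: {value:.1e}")

# Ratio between rates: the new bound is a factor 1/eps above the
# information-theoretic rate, and a factor 1/(gamma^2 * eps) below the prior one.
print("this_paper / information_theoretic:",
      bounds["this_paper"] / bounds["information_theoretic"])
print("prior_efficient / this_paper:",
      bounds["prior_efficient"] / bounds["this_paper"])
```

For these values the new rate sits a factor of roughly $1/\epsilon = 100$ above the information-theoretic rate and a factor of roughly $1/(\gamma^2 \epsilon) = 10^4$ below the prior efficient bound.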

📝 Abstract
We study the problem of PAC learning $\gamma$-margin halfspaces in the presence of Massart noise. Without computational considerations, the sample complexity of this learning problem is known to be $\widetilde{\Theta}(1/(\gamma^2 \epsilon))$. Prior computationally efficient algorithms for the problem incur sample complexity $\tilde{O}(1/(\gamma^4 \epsilon^3))$ and achieve 0-1 error of $\eta+\epsilon$, where $\eta<1/2$ is the upper bound on the noise rate. Recent work gave evidence of an information-computation tradeoff, suggesting that a quadratic dependence on $1/\epsilon$ is required for computationally efficient algorithms. Our main result is a computationally efficient learner with sample complexity $\widetilde{\Theta}(1/(\gamma^2 \epsilon^2))$, nearly matching this lower bound. In addition, our algorithm is simple and practical, relying on online SGD on a carefully selected sequence of convex losses.
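The abstract does not specify the sequence of convex losses, so the following is only a minimal sketch of the general idea of projected online SGD on one convex surrogate. It is not the paper's algorithm. It assumes a single fixed LeakyReLU-style loss with a hypothetical slope parameter `lam`, a toy 2-D data generator with target `w* = (1, 0)`, and illustrative values for `gamma`, `eta`, `step`, and the sample size. All function names are made up for this example.

```python
import math
import random


def leaky_relu_slope(z, lam):
    """Slope of LeakyReLU_lam(z) = (1 - lam) * z if z >= 0 else lam * z."""
    return 1.0 - lam if z >= 0 else lam


def project_unit_ball(w):
    """Euclidean projection onto the unit ball."""
    norm = math.sqrt(sum(v * v for v in w))
    return list(w) if norm <= 1.0 else [v / norm for v in w]


def online_sgd(stream, dim, lam, step):
    """Projected online SGD on LeakyReLU_lam(-y * <w, x>); returns the average iterate."""
    w = [0.0] * dim
    avg = [0.0] * dim
    for t, (x, y) in enumerate(stream, start=1):
        z = -y * sum(wi * xi for wi, xi in zip(w, x))
        slope = leaky_relu_slope(z, lam)
        # Gradient in w is slope * (-y) * x, so the descent step adds slope * y * x.
        w = project_unit_ball([wi + step * slope * y * xi for wi, xi in zip(w, x)])
        avg = [a + (wi - a) / t for a, wi in zip(avg, w)]
    return avg


def sample_point(rng, gamma):
    """Unit-norm 2-D point with margin at least gamma w.r.t. w* = (1, 0) (toy assumption)."""
    while True:
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x = (math.cos(theta), math.sin(theta))
        if abs(x[0]) >= gamma:
            return x


def massart_stream(rng, n, gamma, eta):
    """Yield (x, y) where the clean label is flipped with probability at most eta."""
    for _ in range(n):
        x = sample_point(rng, gamma)
        clean = 1 if x[0] > 0 else -1
        flip_prob = eta if x[1] >= 0 else eta / 2  # instance-dependent, always <= eta
        y = -clean if rng.random() < flip_prob else clean
        yield x, y


rng = random.Random(0)
gamma, eta = 0.3, 0.2  # illustrative values, not from the paper
w_hat = online_sgd(massart_stream(rng, 20000, gamma, eta), dim=2, lam=0.3, step=0.05)

# Disagreement with the clean (noise-free) labels on fresh points.
test_points = [sample_point(rng, gamma) for _ in range(2000)]
mistakes = sum(
    1 for x in test_points
    if (w_hat[0] * x[0] + w_hat[1] * x[1] > 0) != (x[0] > 0)
)
clean_error = mistakes / len(test_points)
print("w_hat =", w_hat, "clean 0-1 error =", clean_error)
```

The reported error is measured against the clean labels. Against the noisy labels, a hypothesis that agrees with the target everywhere still incurs error equal to the average flip probability, which is why the guarantee in the abstract takes the form $\eta + \epsilon$. Averaging the iterates is a standard choice for online-to-batch conversion and is an assumption of this sketch.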
Problem

Research questions and friction points this paper is trying to address.

Massart noise
efficient learning
gamma-margin hyperplane
Innovation

Methods, ideas, or system contributions that make the work stand out.

Massart Noise
Gamma-Margin Hyperplanes
Efficient Learning