🤖 AI Summary
This paper investigates minimax learning rates for binary classification under geometric margin constraints. We develop an analytical framework based on Kolmogorov entropy in Lebesgue norms, integrating minimax theory with geometric margin modeling, for both noiseless and noisy settings in which the decision boundary is described by horizon functions. Our contributions are threefold: (i) for the noiseless case, we establish the first tight lower bound on the optimal learning rate under a geometric margin condition; (ii) under Barron regularity or, with strong margins, Hölder continuity assumptions on the decision boundary, we achieve nearly optimal convergence rates close to $O(n^{-1})$; (iii) for several function classes, including merely convex boundaries, we derive matching upper and lower bounds, ranging from near $O(n^{-1})$ for regular boundaries down to near $O(n^{-1/2})$ for convex ones under a strong margin. The key innovation lies in resolving the long-standing challenge of deriving tight noiseless lower bounds under a geometric margin condition, while unifying the impact of noise structure and functional regularity on learning rates.
📝 Abstract
We study classification problems using binary estimators where the decision boundary is described by horizon functions and where the data distribution satisfies a geometric margin condition. We establish upper and lower bounds for the minimax learning rate over broad function classes with bounded Kolmogorov entropy in Lebesgue norms. A key novelty of our work is the derivation of lower bounds on the worst-case learning rates under a geometric margin condition, a setting that is almost universally satisfied in practice yet remains theoretically challenging. Moreover, our results cover the noiseless setting, where lower bounds are particularly hard to establish. We apply our general results to classification problems with decision boundaries belonging to several function classes: for Barron-regular functions, and for Hölder-continuous functions with strong margins, we identify optimal rates close to the fast learning rate of $\mathcal{O}(n^{-1})$ for $n \in \mathbb{N}$ samples. Even for merely convex decision boundaries, in the strong margin case, optimal rates near $\mathcal{O}(n^{-1/2})$ can be achieved.
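To fix ideas, the following display sketches how such statements are typically formalized; the notation is illustrative and the paper's precise definitions may differ in detail. With $R(f) = \mathbb{P}_{(X,Y)\sim P}\big(f(X) \neq Y\big)$ denoting the misclassification risk and $f_P^\ast$ the Bayes classifier of $P$, the minimax learning rate over a class $\mathcal{P}$ of data distributions is

$$
\inf_{\hat{f}_n} \, \sup_{P \in \mathcal{P}} \; \mathbb{E}_{(X_i,Y_i)_{i=1}^{n} \sim P^{\otimes n}} \Big[ R(\hat{f}_n) - R(f_P^\ast) \Big],
$$

where the infimum runs over all classifiers $\hat{f}_n$ built from $n$ i.i.d. samples. A geometric margin condition with margin $\gamma > 0$ can, for instance, require that the marginal $P_X$ place no mass in a $\gamma$-tube around the decision boundary $\Gamma_P$:

$$
P_X\big(\{ x : \operatorname{dist}(x, \Gamma_P) < \gamma \}\big) = 0.
$$

Matching upper and lower bounds then identify the optimal exponent $\alpha$ in rates of the form $n^{-\alpha}$: per the abstract, $\alpha$ close to $1$ for Barron-regular or Hölder-continuous boundaries with strong margins, and close to $1/2$ for merely convex boundaries.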