Understanding Aggregations of Proper Learners in Multiclass Classification

📅 2024-10-30
🏛️ International Conference on Algorithmic Learning Theory
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work investigates whether the "properness barrier" in multiclass classification, i.e., the fact that some learnable classes cannot be learned by any proper learner (and that optimal rates in general require improper learners), can be overcome by aggregating simple proper learners. For hypothesis classes of finite Graph dimension $d_G$, the authors generalize the optimal binary learners of Hanneke, Larsen, and Aden-Ali et al. to the multiclass setting and show that these majority-vote aggregations achieve sample complexity $O((d_G + \ln(1/\delta))/\epsilon)$, a strict improvement over empirical risk minimization (ERM), which incurs an extra $\ln(1/\epsilon)$ factor. This is complemented by lower bounds: on certain classes of Graph dimension $d_G$, majorities of ERM learners require $\Omega((d_G + \ln(1/\delta))/\epsilon)$ samples, while a single ERM requires $\Omega((d_G \ln(1/\epsilon) + \ln(1/\delta))/\epsilon)$ samples. For multiclass learning in full generality (finite DS dimension but possibly infinite Graph dimension), the authors exhibit a learnable class that cannot be learned to constant error by any aggregation of finitely many proper learners, demonstrating a fundamental limitation of this strategy.

📝 Abstract
Multiclass learnability is known to exhibit a properness barrier: there are learnable classes which cannot be learned by any proper learner. Binary classification faces no such barrier for learnability, but a similar one for optimal learning, which can in general only be achieved by improper learners. Fortunately, recent advances in binary classification have demonstrated that this requirement can be satisfied using aggregations of proper learners, some of which are strikingly simple. This raises a natural question: to what extent can simple aggregations of proper learners overcome the properness barrier in multiclass classification? We give a positive answer to this question for classes which have finite Graph dimension, $d_G$. Namely, we demonstrate that the optimal binary learners of Hanneke, Larsen, and Aden-Ali et al. (appropriately generalized to the multiclass setting) achieve sample complexity $O\left(\frac{d_G + \ln(1/\delta)}{\epsilon}\right)$. This forms a strict improvement upon the sample complexity of ERM. We complement this with a lower bound demonstrating that for certain classes of Graph dimension $d_G$, majorities of ERM learners require $\Omega\left(\frac{d_G + \ln(1/\delta)}{\epsilon}\right)$ samples. Furthermore, we show that a single ERM requires $\Omega\left(\frac{d_G \ln(1/\epsilon) + \ln(1/\delta)}{\epsilon}\right)$ samples on such classes, exceeding the lower bound of Daniely et al. (2015) by a factor of $\ln(1/\epsilon)$. For multiclass learning in full generality -- i.e., for classes of finite DS dimension but possibly infinite Graph dimension -- we give a strong refutation to these learning strategies, by exhibiting a learnable class which cannot be learned to constant error by any aggregation of a finite number of proper learners.
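To make the aggregation idea concrete, here is a minimal sketch, not the paper's exact construction: train ERM on three overlapping subsamples (in the spirit of the simple binary learner of Aden-Ali et al.) and predict by majority vote. The hypothesis class, data, and splitting scheme below are illustrative assumptions, chosen only to make the sketch runnable.

```python
# Illustrative sketch: majority vote over ERM learners trained on three
# overlapping subsamples. This is an assumption-laden toy, not the paper's
# exact algorithm or analysis.
from collections import Counter

def erm(hypotheses, sample):
    """Return a hypothesis minimizing empirical error on the sample."""
    return min(hypotheses, key=lambda h: sum(h(x) != y for x, y in sample))

def majority_of_erms(hypotheses, sample):
    """Run ERM on three overlapping two-thirds of the data; aggregate by vote."""
    n = len(sample)
    subsamples = [
        sample[: 2 * n // 3],                    # omit the last third
        sample[n // 3 :],                        # omit the first third
        sample[: n // 3] + sample[2 * n // 3 :], # omit the middle third
    ]
    learners = [erm(hypotheses, s) for s in subsamples]

    def predict(x):
        votes = Counter(h(x) for h in learners)
        return votes.most_common(1)[0][0]

    return predict

# Toy multiclass instance: threshold-style hypotheses mapping into {0, 1, 2}.
hypotheses = [lambda x, t=t: min(2, x // t) for t in (1, 2, 3)]
data = [(x, min(2, x // 2)) for x in range(12)]  # realizable under t = 2
predictor = majority_of_erms(hypotheses, data)
print([predictor(x) for x in range(6)])
```

Each component learner here is proper (it outputs a member of `hypotheses`), while the majority vote is generally improper, which is exactly the loophole the paper exploits.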
Problem

Research questions and friction points this paper is trying to address.

Overcoming properness barrier in multiclass classification
Optimal learning via aggregations of proper learners
Sample complexity bounds for multiclass learnability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Aggregations of proper learners overcome multiclass barriers
Optimal sample complexity achieved via generalized binary learners
Lower bounds show ERM limitations in multiclass learning
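The separation claimed above can be written side by side (notation as in the abstract; the subscripts "maj" and "ERM" are labels added here for readability):

$$
m_{\mathrm{maj}}(\epsilon, \delta) = O\!\left(\frac{d_G + \ln(1/\delta)}{\epsilon}\right)
\qquad \text{vs.} \qquad
m_{\mathrm{ERM}}(\epsilon, \delta) = \Omega\!\left(\frac{d_G \ln(1/\epsilon) + \ln(1/\delta)}{\epsilon}\right),
$$

so on the hard classes constructed in the paper, majority aggregation removes the extra $\ln(1/\epsilon)$ factor that a single ERM must pay.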