Active Learning of General Halfspaces: Label Queries vs Membership Queries

๐Ÿ“… 2024-12-31
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
This paper studies active learning of general halfspaces under Gaussian distributions, focusing on the sample-efficiency bottlenecks arising from data skewness and agnostic noise. We propose an adaptive membership-query-based algorithm, the first to achieve efficient learning in the agnostic setting. Crucially, we prove that label queries cannot surpass passive learning with polynomial-sized sample pools, establishing a fundamental separation between the membership-query and label-query paradigms. Technically, our approach integrates geometric characterizations of Gaussian space, information-theoretic lower-bound analysis, bias-$p$-driven query selection, and a novel decomposition of the agnostic error. Our algorithm achieves optimal query complexity $\tilde{O}(\min\{1/p, 1/\epsilon\} + d \cdot \mathrm{polylog}(1/\epsilon))$ and classification error $O(\mathrm{opt}) + \epsilon$, strictly improving upon the proven lower bound for label queries.

๐Ÿ“ Abstract
We study the problem of learning general (i.e., not necessarily homogeneous) halfspaces under the Gaussian distribution on $\mathbb{R}^d$ in the presence of some form of query access. In the classical pool-based active learning model, where the algorithm is allowed to make adaptive label queries to previously sampled points, we establish a strong information-theoretic lower bound ruling out non-trivial improvements over the passive setting. Specifically, we show that any active learner requires label complexity of $\tilde{\Omega}(d/(\log(m)\epsilon))$, where $m$ is the number of unlabeled examples. In particular, to beat the passive label complexity of $\tilde{O}(d/\epsilon)$, an active learner requires a pool of $2^{\mathrm{poly}(d)}$ unlabeled samples. On the positive side, we show that this lower bound can be circumvented with membership query access, even in the agnostic model. Specifically, we give a computationally efficient learner with query complexity of $\tilde{O}(\min\{1/p, 1/\epsilon\} + d \cdot \mathrm{polylog}(1/\epsilon))$ achieving an error guarantee of $O(\mathrm{opt}) + \epsilon$. Here $p \in [0, 1/2]$ is the bias and $\mathrm{opt}$ is the 0-1 loss of the optimal halfspace. As a corollary, we obtain a strong separation between the active and membership query models. Taken together, our results characterize the complexity of learning general halfspaces under Gaussian marginals in these models.
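The key advantage of membership queries over pool-based label queries is that the learner may synthesize arbitrary query points rather than being restricted to a sampled pool. A minimal toy sketch of that advantage (not the paper's algorithm, which handles the agnostic setting and unknown direction): if the direction $w$ of a noiseless halfspace $\mathrm{sign}(\langle w, x\rangle - t)$ were already known, the offset $t$ could be located by binary search along $w$ using membership queries, so the query count scales logarithmically in the target accuracy. All names below (`membership_oracle`, `estimate_threshold`) are illustrative assumptions.

```python
def membership_oracle(x, w, t):
    # Ground-truth halfspace label sign(<w, x> - t).
    # A membership-query learner may evaluate this at ANY point x it constructs.
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) - t >= 0 else -1

def estimate_threshold(w, oracle, lo=-10.0, hi=10.0, eps=1e-6):
    # Binary search for the offset t along a known unit direction w,
    # using synthesized query points mid * w: O(log((hi - lo) / eps)) queries.
    while hi - lo > eps:
        mid = (lo + hi) / 2.0
        # Since ||w|| = 1, the point mid * w satisfies <w, mid * w> = mid,
        # so the oracle answers sign(mid - t).
        if oracle([mid * wi for wi in w]) >= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0

w = [1.0, 0.0, 0.0]  # unit direction, assumed known for this illustration only
t_true = 1.37
t_hat = estimate_threshold(w, lambda x: membership_oracle(x, w, t_true))
print(abs(t_hat - t_true) < 1e-5)  # True
```

A pool-based active learner cannot run this search: it can only query labels of points that happen to appear in its unlabeled pool, which is exactly the limitation the paper's lower bound exploits.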
Problem

Research questions and friction points this paper is trying to address.

Active Learning
Spatial Partitioning
Gaussian Distributions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Gaussian Distribution
Membership Queries
Efficient Learning
๐Ÿ”Ž Similar Papers
No similar papers found.