Active Learning of General Halfspaces: Label Queries vs Membership Queries

📅 2024-12-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper studies active learning of general (not necessarily homogeneous) halfspaces under Gaussian distributions, focusing on the label-efficiency bottlenecks caused by biased (skewed) labels and agnostic noise. It proves that in the pool-based model, label queries cannot improve non-trivially on passive learning when the pool of unlabeled samples has polynomial size. It also gives a computationally efficient, adaptive membership-query algorithm that circumvents this lower bound even in the agnostic setting, which establishes a separation between the membership-query and label-query models. Technically, the approach combines geometric characterizations of Gaussian space, information-theoretic lower-bound analysis, bias-*p*-driven query selection, and a decomposition of agnostic error. The algorithm achieves query complexity $\tilde{O}(\min\{1/p, 1/\epsilon\} + d \cdot \mathrm{polylog}(1/\epsilon))$ and classification error $O(\mathrm{opt}) + \epsilon$, which falls below the lower bound proven for label queries over polynomial-size pools.

📝 Abstract

We study the problem of learning general (i.e., not necessarily homogeneous) halfspaces under the Gaussian distribution on $\mathbb{R}^d$ in the presence of some form of query access. In the classical pool-based active learning model, where the algorithm is allowed to make adaptive label queries to previously sampled points, we establish a strong information-theoretic lower bound ruling out non-trivial improvements over the passive setting. Specifically, we show that any active learner requires label complexity of $\tilde{\Omega}(d/(\log(m)\epsilon))$, where $m$ is the number of unlabeled examples. In particular, to beat the passive label complexity of $\tilde{O}(d/\epsilon)$, an active learner requires a pool of $2^{\mathrm{poly}(d)}$ unlabeled samples. On the positive side, we show that this lower bound can be circumvented with membership query access, even in the agnostic model. Specifically, we give a computationally efficient learner with query complexity of $\tilde{O}(\min\{1/p, 1/\epsilon\} + d \cdot \mathrm{polylog}(1/\epsilon))$ achieving error guarantee of $O(\mathrm{opt}) + \epsilon$. Here $p \in [0, 1/2]$ is the bias and $\mathrm{opt}$ is the 0-1 loss of the optimal halfspace. As a corollary, we obtain a strong separation between the active and membership query models. Taken together, our results characterize the complexity of learning general halfspaces under Gaussian marginals in these models.
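To make the two bounds concrete, here is a minimal Python sketch that evaluates them numerically. It is an illustration under stated assumptions, not the paper's algorithm. All constants and hidden logarithmic factors are dropped, so only the rough scaling is meaningful. The function names, the `polylog_power` parameter, and the reading of the bias $p$ as the Gaussian mass of the minority label of $\mathrm{sign}(\langle w, x \rangle - t)$ with $\|w\| = 1$ are assumptions introduced here.

```python
import math


def gaussian_bias(threshold: float) -> float:
    """Minority-label mass of sign(<w, x> - threshold), ||w|| = 1, x ~ N(0, I_d).

    Assumption: "bias" means the probability of the rarer label. Since <w, x>
    is a standard normal variable, that mass is Phi(-|threshold|).
    """
    return 0.5 * math.erfc(abs(threshold) / math.sqrt(2.0))


def label_query_lower_bound(d: int, eps: float, m: float) -> float:
    """Scaling of the pool-based lower bound d / (log(m) * eps), constants dropped."""
    return d / (math.log(m) * eps)


def membership_query_upper_bound(d: int, eps: float, p: float,
                                 polylog_power: int = 1) -> float:
    """Scaling of min{1/p, 1/eps} + d * polylog(1/eps), constants dropped.

    polylog_power is a hypothetical stand-in for the unspecified polylog exponent.
    """
    rare_label_term = 1.0 / eps if p == 0 else min(1.0 / p, 1.0 / eps)
    return rare_label_term + d * math.log(1.0 / eps) ** polylog_power


if __name__ == "__main__":
    d, eps = 100, 0.01
    p = gaussian_bias(2.0)   # a halfspace whose boundary is 2 units from the origin
    m = 10 ** 6              # a polynomial-size pool of unlabeled samples
    print("bias p:", p)
    print("label-query scaling:", label_query_lower_bound(d, eps, m))
    print("membership-query scaling:", membership_query_upper_bound(d, eps, p))
```

With a polynomial-size pool, $\log(m)$ grows only logarithmically in $d$, so the label-query bound stays close to the passive rate $d/\epsilon$. The membership-query bound depends on $1/\epsilon$ only through the additive $\min\{1/p, 1/\epsilon\}$ term and a polylogarithmic factor.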

Problem

Research questions and friction points this paper is trying to address.

Active Learning

Spatial Partitioning

Gaussian Distributions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Gaussian Distribution

Membership Queries

Efficient Learning
