Near-Polynomially Competitive Active Logistic Regression

📅 2025-03-07
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This paper investigates label-query efficiency for active logistic regression in the realizable setting, aiming for near-optimal generalization with far fewer labeled examples than passive learning requires. The authors propose the first active learning algorithm for logistic regression that is near-polynomially competitive with the optimal algorithm on every input distribution, up to factors polylogarithmic in the error and domain size. In particular, whenever any algorithm achieves label complexity polylogarithmic in 1/ε, the proposed algorithm does as well, in contrast to prior approaches that can incur polynomial dependence on 1/ε. The method rests on an adaptive, computationally efficient sampling strategy and extends to more general function classes. Experiments on logistic regression benchmarks demonstrate gains over existing active learning baselines.

📝 Abstract
We address the problem of active logistic regression in the realizable setting. It is well known that active learning can require exponentially fewer label queries compared to passive learning, in some cases using $\log \frac{1}{\epsilon}$ rather than $\mathrm{poly}(1/\epsilon)$ labels to get error $\epsilon$ larger than the optimum. We present the first algorithm that is polynomially competitive with the optimal algorithm on every input instance, up to factors polylogarithmic in the error and domain size. In particular, if any algorithm achieves label complexity polylogarithmic in $\epsilon$, so does ours. Our algorithm is based on efficient sampling and can be extended to learn more general classes of functions. We further support our theoretical results with experiments demonstrating performance gains for logistic regression compared to existing active learning algorithms.
Problem

Research questions and friction points this paper is trying to address.

Active logistic regression in the realizable setting
Designing an algorithm polynomially competitive with the optimal one on every instance
Efficient sampling to reduce the number of label queries
Innovation

Methods, ideas, or system contributions that make the work stand out.

Near-polynomially competitive active logistic regression algorithm
Efficient sampling for reduced label complexity
Extendable to more general function classes
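The paper's own algorithm is not specified on this page, so the sketch below illustrates the general setting it operates in: pool-based active learning for logistic regression, where the learner adaptively chooses which points to send to a label oracle. It uses plain uncertainty sampling (query the unlabeled point closest to the current decision boundary), a standard baseline rather than the paper's competitive method; all function names and parameters here are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.5, steps=500):
    # Plain gradient descent on the (unregularized) logistic loss.
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ w)
        w -= lr * X.T @ (p - y) / len(y)
    return w

def active_learn(X_pool, oracle, n_queries, seed=0):
    """Pool-based active learning with uncertainty sampling (illustrative)."""
    rng = np.random.default_rng(seed)
    # Seed the labeled set with two randomly chosen points.
    labeled = list(rng.choice(len(X_pool), size=2, replace=False))
    y = {i: oracle(X_pool[i]) for i in labeled}
    for _ in range(n_queries - 2):
        w = fit_logistic(X_pool[labeled], np.array([y[i] for i in labeled]))
        # Query the unlabeled point whose predicted probability is closest
        # to 0.5, i.e. the point nearest the current decision boundary.
        probs = sigmoid(X_pool @ w)
        unlabeled = [i for i in range(len(X_pool)) if i not in y]
        pick = min(unlabeled, key=lambda i: abs(probs[i] - 0.5))
        labeled.append(pick)
        y[pick] = oracle(X_pool[pick])
    return fit_logistic(X_pool[labeled], np.array([y[i] for i in labeled]))
```

In the realizable setting the oracle is a noiseless linear threshold, so a small, adaptively chosen label budget can already pin down the separator; the paper's contribution is an algorithm whose label complexity is competitive with the best possible on every instance, which this greedy heuristic does not guarantee.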