Near-Polynomially Competitive Active Logistic Regression

📅 2025-03-07
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This paper investigates label-query efficiency for active logistic regression in the realizable setting, aiming for near-optimal generalization with far fewer labeled examples than passive learning requires. The authors propose the first active learning algorithm for logistic regression that is near-polynomially competitive with the optimal algorithm on every input distribution, up to factors polylogarithmic in the error and domain size. In particular, whenever any algorithm achieves label complexity polylogarithmic in 1/ε, the proposed algorithm does as well, in contrast to prior approaches that can incur polynomial dependence on 1/ε. The method rests on an adaptive, computationally efficient sampling strategy and extends to more general function classes. Experiments on logistic regression benchmarks demonstrate gains over existing active learning baselines.

📝 Abstract
We address the problem of active logistic regression in the realizable setting. It is well known that active learning can require exponentially fewer label queries compared to passive learning, in some cases using $\log \frac{1}{\epsilon}$ rather than $\mathrm{poly}(1/\epsilon)$ labels to get error $\epsilon$ larger than the optimum. We present the first algorithm that is polynomially competitive with the optimal algorithm on every input instance, up to factors polylogarithmic in the error and domain size. In particular, if any algorithm achieves label complexity polylogarithmic in $\epsilon$, so does ours. Our algorithm is based on efficient sampling and can be extended to learn more general classes of functions. We further support our theoretical results with experiments demonstrating performance gains for logistic regression compared to existing active learning algorithms.
Problem

Research questions and friction points this paper is trying to address.

Active logistic regression in the realizable setting
Designing an algorithm polynomially competitive with the optimal one on every instance
Efficient sampling to reduce the number of label queries
Innovation

Methods, ideas, or system contributions that make the work stand out.

Near-polynomially competitive active logistic regression algorithm
Efficient sampling for reduced label complexity
Extendable to more general function classes
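The paper's own algorithm is not specified on this page, so the sketch below illustrates the general setting it operates in: pool-based active learning for logistic regression, where the learner adaptively chooses which points to send to a label oracle. It uses plain uncertainty sampling (query the unlabeled point closest to the current decision boundary), a standard baseline rather than the paper's competitive method; all function names and parameters here are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.5, steps=500):
    # Plain gradient descent on the (unregularized) logistic loss.
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ w)
        w -= lr * X.T @ (p - y) / len(y)
    return w

def active_learn(X_pool, oracle, n_queries, seed=0):
    """Pool-based active learning with uncertainty sampling (illustrative)."""
    rng = np.random.default_rng(seed)
    # Seed the labeled set with two randomly chosen points.
    labeled = list(rng.choice(len(X_pool), size=2, replace=False))
    y = {i: oracle(X_pool[i]) for i in labeled}
    for _ in range(n_queries - 2):
        w = fit_logistic(X_pool[labeled], np.array([y[i] for i in labeled]))
        # Query the unlabeled point whose predicted probability is closest
        # to 0.5, i.e. the point nearest the current decision boundary.
        probs = sigmoid(X_pool @ w)
        unlabeled = [i for i in range(len(X_pool)) if i not in y]
        pick = min(unlabeled, key=lambda i: abs(probs[i] - 0.5))
        labeled.append(pick)
        y[pick] = oracle(X_pool[pick])
    return fit_logistic(X_pool[labeled], np.array([y[i] for i in labeled]))
```

In the realizable setting the oracle is a noiseless linear threshold, so a small, adaptively chosen label budget can already pin down the separator; the paper's contribution is an algorithm whose label complexity is competitive with the best possible on every instance, which this greedy heuristic does not guarantee.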