Active Learning via Regression Beyond Realizability

📅 2025-05-31
🤖 AI Summary
Existing surrogate-risk-based active learning methods rely on the strong realizability assumption (that the optimal surrogate predictor lies within the hypothesis class), which limits their practical applicability. Method: This paper proposes a label-efficient active learning framework for multiclass classification under conditions significantly weaker than realizability, requiring that the hypothesis class be convex. Its epoch-based algorithm fits a model from the full class to the queried data in each epoch and returns an improper classifier that aggregates these models. Contribution/Results: The analysis gives label and sample complexity guarantees comparable to those of prior realizability-based work, and the paper shows that the algorithmic approaches of prior work can fail in non-realizable settings where the new assumption still holds.

📝 Abstract
We present a new active learning framework for multiclass classification based on surrogate risk minimization that operates beyond the standard realizability assumption. Existing surrogate-based active learning algorithms crucially rely on realizability (the assumption that the optimal surrogate predictor lies within the model class), limiting their applicability in practical, misspecified settings. In this work we show that under conditions significantly weaker than realizability, as long as the class of models considered is convex, one can still obtain a label and sample complexity comparable to prior work. Despite achieving similar rates, the algorithmic approaches from prior works can be shown to fail in non-realizable settings where our assumption is satisfied. Our epoch-based active learning algorithm departs from prior methods by fitting a model from the full class to the queried data in each epoch and returning an improper classifier obtained by aggregating these models.
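
The epoch structure described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the linear model class, the squared loss on one-hot labels, the margin-based query rule, the shrinking threshold `1 / (m + 1)`, and the majority-vote aggregation are all assumptions chosen for illustration, and every function name is hypothetical. It only shows the general shape: query some labels in each epoch, refit a model from the full class on all queried data, and return an aggregate of the per-epoch models.

```python
import numpy as np


def fit_least_squares(X, Y, ridge=1e-3):
    """Regress one-hot labels on features (squared surrogate loss, assumed)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T @ Y)


def margin(scores):
    """Gap between the largest and second-largest class score in each row."""
    top2 = np.sort(scores, axis=1)[:, -2:]
    return top2[:, 1] - top2[:, 0]


def epoch_active_learner(X, oracle, n_classes, n_epochs=5, seed=0):
    """Return the per-epoch models and the number of labels queried.

    `oracle(indices)` is a hypothetical labeling function that returns
    integer class labels for the requested rows of X.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    batches = np.array_split(rng.permutation(n), n_epochs)
    W = np.zeros((d, n_classes))  # initial model: every margin is zero
    models, queried_idx, labels = [], [], []
    for m, batch in enumerate(batches):
        threshold = 1.0 / (m + 1)  # assumed shrinking query threshold
        # Query only the points on which the current model is uncertain.
        ask = batch[margin(X[batch] @ W) <= threshold]
        queried_idx.extend(ask.tolist())
        labels.extend(np.asarray(oracle(ask)).tolist())
        # Refit over the full model class on all labels queried so far.
        idx = np.array(queried_idx, dtype=int)
        Y = np.eye(n_classes)[np.array(labels, dtype=int)]
        W = fit_least_squares(X[idx], Y)
        models.append(W)
    return models, len(queried_idx)


def aggregate_predict(models, X, n_classes):
    """Majority vote over the per-epoch classifiers (assumed aggregation)."""
    rows = np.arange(X.shape[0])
    votes = np.zeros((X.shape[0], n_classes))
    for W in models:
        votes[rows, np.argmax(X @ W, axis=1)] += 1
    return np.argmax(votes, axis=1)
```

A majority vote over several linear classifiers is in general not itself the argmax of a single linear model, which is one simple sense in which an aggregated classifier can be improper. To try the sketch, pass a feature matrix with a constant bias column and an oracle such as `lambda idx: y[idx]` over a held-back label array `y`.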
Problem

Research questions and friction points this paper is trying to address.

Active learning beyond the realizability assumption
Surrogate risk minimization for multiclass classification
Convex model classes under conditions weaker than realizability
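
For reference, the realizability assumption named above can be written out as follows. The notation is introduced here and is not taken from the paper: $\mathcal{F}$ is the model class, $\ell$ the surrogate loss, $f^\star$ the optimal surrogate predictor over all measurable functions, and $f_{\mathcal{F}}$ the best predictor within the class.

```latex
f^\star \in \operatorname*{arg\,min}_{f \text{ measurable}}
    \mathbb{E}_{(x,y)}\big[\ell(f(x), y)\big],
\qquad
f_{\mathcal{F}} \in \operatorname*{arg\,min}_{f \in \mathcal{F}}
    \mathbb{E}_{(x,y)}\big[\ell(f(x), y)\big].

\text{Realizability: } f^\star \in \mathcal{F}
\quad (\text{so one may take } f_{\mathcal{F}} = f^\star).
\qquad
\text{Misspecification: } f^\star \notin \mathcal{F}.
```

The precise condition the paper substitutes for realizability is not stated in this summary beyond requiring $\mathcal{F}$ to be convex, so it is not reproduced here.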
Innovation

Methods, ideas, or system contributions that make the work stand out.

Active learning beyond the realizability assumption
Convex model class for surrogate risk minimization
Epoch-based algorithm that returns an improper classifier by aggregating per-epoch models