🤖 AI Summary
This work addresses the lack of quantitative metrics suitable for evaluating active learning across multiple iterative rounds. The authors propose the "speed-up factor," a novel performance measure that is formally defined and empirically shown to be stable and interpretable. It quantifies the proportion of labeled samples an active learning strategy requires, relative to random sampling, to reach equivalent model performance over successive acquisition rounds. Through an extensive empirical analysis across four datasets from diverse domains and seven query strategies, the study demonstrates that the speed-up factor reliably captures sample efficiency and remains more stable and robust than existing evaluation metrics as the number of iterations varies, addressing a critical gap in current active learning assessment frameworks.
📝 Abstract
Machine learning models excel with abundant annotated data, but annotation is often costly and time-intensive. Active learning (AL) aims to improve the performance-to-annotation ratio by using query methods (QMs) to iteratively select the most informative samples. While AL research focuses mainly on QM development, the evaluation of this iterative process lacks appropriate performance metrics. This work reviews eight years of AL evaluation literature and formally introduces the speed-up factor, a quantitative multi-iteration QM performance metric that indicates the fraction of samples needed to match random sampling performance. Using four datasets from diverse domains and seven QMs of various types, we empirically evaluate the speed-up factor and compare it with state-of-the-art AL performance metrics. The results confirm the assumptions underlying the speed-up factor, demonstrate its accuracy in capturing the described fraction, and reveal its superior stability across iterations.
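The abstract describes the speed-up factor only informally, as the fraction of samples a query method needs to match the performance random sampling achieves. The sketch below is one plausible way to estimate such a ratio from two learning curves; the function names (`speedup_factor`, `samples_to_reach`), the inputs, and the averaging over acquisition rounds are illustrative assumptions, not the paper's formal definition.

```python
import numpy as np

def samples_to_reach(performance, labels_used, target):
    """Return the smallest labeled-set size at which a learning curve
    first reaches `target` performance (np.nan if it never does)."""
    performance = np.asarray(performance)
    labels_used = np.asarray(labels_used)
    reached = performance >= target
    return labels_used[np.argmax(reached)] if reached.any() else np.nan

def speedup_factor(perf_qm, labels_qm, perf_rand, labels_rand):
    """Illustrative speed-up estimate: for each performance level the
    random-sampling curve attains, find how many labeled samples the
    query method needed to reach it, take the ratio to the random
    sample count, and average over rounds. This is an assumed reading
    of the metric, not the paper's exact formula."""
    ratios = []
    for n_rand, target in zip(labels_rand, perf_rand):
        n_qm = samples_to_reach(perf_qm, labels_qm, target)
        if not np.isnan(n_qm):
            # Fraction of random sampling's label budget the QM needed
            ratios.append(n_qm / n_rand)
    return float(np.mean(ratios)) if ratios else np.nan

# Hypothetical learning curves: accuracy per round and labels used per round
perf_rand   = [0.60, 0.70, 0.76, 0.80]
labels_rand = [100, 200, 300, 400]
perf_qm     = [0.68, 0.78, 0.82, 0.85]
labels_qm   = [100, 200, 300, 400]

print(speedup_factor(perf_qm, labels_qm, perf_rand, labels_rand))
```

Under this reading, a value of roughly 0.6 would mean the query method needed about 60% of the labels that random sampling required to reach the same performance levels, with lower values indicating greater sample efficiency.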