🤖 AI Summary
Conventional active learning suffers from a "cold-start" problem in few-shot settings, requiring substantial initial labeled data. Method: We propose a zero-shot active learning framework that uses large language models (LLMs) such as GPT-4, o1, and Llama 3 as off-the-shelf query engines without fine-tuning. The framework performs zero-shot or few-shot uncertainty estimation and ranks candidate samples to select highly informative instances. It is model-agnostic, supporting diverse downstream models (e.g., BERT, ADAPET, PERFECT, SetFit) in both few-shot and iterative non-few-shot settings, and can augment traditional active learning methods to alleviate their cold-start issues. Contribution/Results: Experiments demonstrate significant accuracy gains for BERT on few-shot text classification tasks, consistently outperforming established baselines. The approach generalizes well and remains stable across varying data scales, enabling label-efficient learning without reliance on initial annotations.
📝 Abstract
Active learning is designed to minimize annotation effort by prioritizing instances that most enhance learning. However, many active learning strategies struggle with a "cold-start" problem, needing substantial initial data to be effective. This limitation reduces their utility in the increasingly relevant few-shot scenarios, where instance selection has a substantial impact. To address this, we introduce ActiveLLM, a novel active learning approach that leverages Large Language Models such as GPT-4, o1, Llama 3, or Mistral Large for selecting instances. We demonstrate that ActiveLLM significantly enhances the classification performance of BERT classifiers in few-shot scenarios, outperforming traditional active learning methods as well as improving the few-shot learning methods ADAPET, PERFECT, and SetFit. Additionally, ActiveLLM can be extended to non-few-shot scenarios, allowing for iterative selections. In this way, ActiveLLM can even help other active learning strategies to overcome their cold-start problem. Our results suggest that ActiveLLM offers a promising solution for improving model performance across various learning setups.
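The selection scheme described above can be sketched in a few lines: an LLM is prompted with the task description and the pool of unlabeled candidates, and asked to pick the instances it judges most informative, with no initial labels required. This is a minimal illustrative sketch, not the paper's implementation; the function names (`build_prompt`, `select_instances`) and the `query_llm` callable are hypothetical placeholders for a real LLM API call.

```python
# Hedged sketch of LLM-driven instance selection in the spirit of ActiveLLM.
# The LLM (any chat model, e.g. GPT-4 or Llama 3) is treated as an
# off-the-shelf query engine: it receives the candidate pool and returns
# the indices of the examples it deems most informative to label.
from typing import Callable, List


def build_prompt(task: str, candidates: List[str], k: int) -> str:
    """Assemble a selection prompt listing the unlabeled candidate pool.

    The exact prompt wording here is illustrative, not the paper's prompt.
    """
    lines = [
        f"Task: {task}",
        f"Select the {k} most informative examples for a human to label.",
        "Candidates:",
    ]
    lines += [f"{i}: {text}" for i, text in enumerate(candidates)]
    return "\n".join(lines)


def select_instances(
    task: str,
    candidates: List[str],
    k: int,
    query_llm: Callable[[str], List[int]],
) -> List[str]:
    """Ask the LLM for the indices of the k most informative candidates.

    `query_llm` abstracts the model call: it takes the prompt and returns
    a ranked list of candidate indices (a real implementation would parse
    these from the model's text response).
    """
    prompt = build_prompt(task, candidates, k)
    indices = query_llm(prompt)[:k]
    return [candidates[i] for i in indices]
```

In an iterative (non-few-shot) setting, this selection step would be repeated: the chosen instances are labeled, the downstream classifier (e.g. BERT) is retrained, and the remaining pool is re-ranked in the next round.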