🤖 AI Summary
Conventional active learning suffers from a "cold-start" problem in few-shot settings, requiring substantial initial labeled data. Method: We propose a zero-shot active learning framework that uses large language models (LLMs) such as GPT-4, o1, and Llama 3 as off-the-shelf query engines without fine-tuning. The framework performs zero-shot or few-shot uncertainty estimation and ranks candidate samples to select highly informative instances. It is model-agnostic, supporting diverse downstream models (e.g., BERT, ADAPET, PERFECT, SetFit) in both few-shot and iterative non-few-shot settings, and can augment traditional active learning methods to alleviate their cold-start issues. Contribution/Results: Experiments demonstrate significant accuracy gains for BERT on few-shot text classification tasks, consistently outperforming established baselines. The approach generalizes well and remains stable across varying data scales, enabling label-efficient learning without reliance on initial annotations.
📝 Abstract
Active learning is designed to minimize annotation effort by prioritizing instances that most enhance learning. However, many active learning strategies struggle with a "cold-start" problem, needing substantial initial data to be effective. This limitation reduces their utility in the increasingly relevant few-shot scenarios, where instance selection has a substantial impact. To address this, we introduce ActiveLLM, a novel active learning approach that leverages Large Language Models such as GPT-4, o1, Llama 3, or Mistral Large for selecting instances. We demonstrate that ActiveLLM significantly enhances the classification performance of BERT classifiers in few-shot scenarios, outperforming traditional active learning methods as well as improving the few-shot learning methods ADAPET, PERFECT, and SetFit. Additionally, ActiveLLM can be extended to non-few-shot scenarios, allowing for iterative selections. In this way, ActiveLLM can even help other active learning strategies to overcome their cold-start problem. Our results suggest that ActiveLLM offers a promising solution for improving model performance across various learning setups.
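The selection scheme described above can be sketched in a few lines: an LLM is prompted with the task description and the pool of unlabeled candidates, and asked to pick the instances it judges most informative, with no initial labels required. This is a minimal illustrative sketch, not the paper's implementation; the function names (`build_prompt`, `select_instances`) and the `query_llm` callable are hypothetical placeholders for a real LLM API call.

```python
# Hedged sketch of LLM-driven instance selection in the spirit of ActiveLLM.
# The LLM (any chat model, e.g. GPT-4 or Llama 3) is treated as an
# off-the-shelf query engine: it receives the candidate pool and returns
# the indices of the examples it deems most informative to label.
from typing import Callable, List


def build_prompt(task: str, candidates: List[str], k: int) -> str:
    """Assemble a selection prompt listing the unlabeled candidate pool.

    The exact prompt wording here is illustrative, not the paper's prompt.
    """
    lines = [
        f"Task: {task}",
        f"Select the {k} most informative examples for a human to label.",
        "Candidates:",
    ]
    lines += [f"{i}: {text}" for i, text in enumerate(candidates)]
    return "\n".join(lines)


def select_instances(
    task: str,
    candidates: List[str],
    k: int,
    query_llm: Callable[[str], List[int]],
) -> List[str]:
    """Ask the LLM for the indices of the k most informative candidates.

    `query_llm` abstracts the model call: it takes the prompt and returns
    a ranked list of candidate indices (a real implementation would parse
    these from the model's text response).
    """
    prompt = build_prompt(task, candidates, k)
    indices = query_llm(prompt)[:k]
    return [candidates[i] for i in indices]
```

In an iterative (non-few-shot) setting, this selection step would be repeated: the chosen instances are labeled, the downstream classifier (e.g. BERT) is retrained, and the remaining pool is re-ranked in the next round.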