Avoid Wasted Annotation Costs in Open-set Active Learning with Pre-trained Vision-Language Model

📅 2024-08-09
🏛️ arXiv.org
📈 Citations: 1
✨ Influential: 0
🤖 AI Summary
In open-set active learning, unlabeled data often contain out-of-distribution (OOD) samples, and blindly annotating them wastes substantial labeling cost. Existing methods struggle to jointly optimize sample informativeness and in-distribution (ID) purity, and rely heavily on OOD labels or auxiliary training. This paper proposes CLIPNAL, a novel active learning framework that, for the first time, eliminates reliance on labeled OOD data. It leverages a pre-trained CLIP model to perform OOD detection and filtering without any additional training, then selects highly informative samples exclusively from the remaining ID pool. CLIPNAL jointly improves purity and informativeness via semantic-visual alignment scoring and a two-stage selection strategy (purity filtering followed by informativeness-based ranking), requiring neither OOD annotations nor extra model training. Experiments across diverse open-set settings show that CLIPNAL achieves state-of-the-art model performance at the lowest annotation cost, significantly outperforming existing approaches.
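The summary's "unsupervised OOD detection with a pre-trained CLIP model" can be illustrated with a small sketch. This is a hedged simplification of the CLIPN idea of pairing each class with a standard prompt and a negation ("no") prompt; the function name, shapes, and scoring formula here are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

def purity_score(image_feat, yes_text_feats, no_text_feats, temperature=100.0):
    """Illustrative CLIPN-style ID/OOD score (an assumption, not the paper's code).
    For each ID class c, similarity to a standard ("yes") prompt and to a
    negation ("no") prompt yields a per-class probability that the image
    belongs to c; an image unlikely to match any ID class is treated as OOD.
    image_feat:     (D,)   L2-normalized image embedding
    yes_text_feats: (C, D) L2-normalized standard-prompt embeddings
    no_text_feats:  (C, D) L2-normalized negation-prompt embeddings
    Returns a scalar in [0, 1]; higher means more likely in-distribution.
    """
    yes_logits = temperature * (yes_text_feats @ image_feat)  # (C,)
    no_logits = temperature * (no_text_feats @ image_feat)    # (C,)

    # Class posterior from the standard prompts (softmax, max-shifted for stability).
    p_class = np.exp(yes_logits - yes_logits.max())
    p_class /= p_class.sum()

    # Per-class probability that the "yes" prompt wins over the "no" prompt.
    p_yes = 1.0 / (1.0 + np.exp(no_logits - yes_logits))      # sigmoid

    # ID probability: sum over classes of P(c) * P(yes | c).
    return float((p_class * p_yes).sum())
```

Samples scoring below a purity threshold would be excluded before any informativeness ranking, which is what avoids spending annotation budget on OOD data.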

πŸ“ Abstract
Active learning (AL) aims to enhance model performance by selectively collecting highly informative data, thereby minimizing annotation costs. However, in practical scenarios, unlabeled data may contain out-of-distribution (OOD) samples, leading to wasted annotation costs if data is incorrectly selected. Recent research has explored methods to apply AL to open-set data, but these methods either require OOD samples in advance or still incur unavoidable annotation-cost losses while trying to minimize them. To address these challenges, we propose a novel selection strategy, CLIPN for AL (CLIPNAL), which minimizes cost losses without requiring OOD samples. CLIPNAL sequentially evaluates the purity and informativeness of data. First, it utilizes a pre-trained vision-language model to detect and exclude OOD data by leveraging the linguistic and visual information of in-distribution (ID) data, without additional training. Second, it selects highly informative data from the remaining ID data, and the selected samples are then annotated by human experts. Experimental results on datasets with various open-set conditions demonstrate that CLIPNAL achieves the lowest cost loss and highest performance across all scenarios. Code is available at https://github.com/DSBA-Lab/OpenAL.
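The sequential "purity first, then informativeness" evaluation described in the abstract can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions (the function name, the purity threshold, the temperature, and entropy as the informativeness score are all illustrative choices, not the paper's implementation):

```python
import numpy as np

def select_queries(image_feats, text_feats, budget, purity_threshold=0.5):
    """Two-stage query selection sketch (assumed simplification of CLIPNAL).
    Stage 1 (purity): keep samples whose best image-text similarity is high,
    i.e. likely in-distribution; the rest are treated as OOD and excluded.
    Stage 2 (informativeness): among kept samples, rank by predictive
    entropy over the class similarities and return the top-`budget` indices.
    image_feats: (N, D) L2-normalized image embeddings (e.g. from CLIP)
    text_feats:  (C, D) L2-normalized class-prompt embeddings
    """
    sims = image_feats @ text_feats.T                # (N, C) cosine similarities
    purity = sims.max(axis=1)                        # alignment with best ID class
    id_mask = purity >= purity_threshold             # stage 1: filter likely OOD

    logits = sims * 100.0                            # CLIP-style temperature
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)  # stage 2 score

    entropy = np.where(id_mask, entropy, -np.inf)    # never query filtered samples
    order = np.argsort(-entropy)                     # most informative first
    return order[:budget]
```

Because the OOD filter runs before the informativeness ranking, the annotation budget is spent only on samples the model believes are ID, which is the cost-loss reduction the paper targets.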
Problem

Research questions and friction points this paper is trying to address.

Wasted annotation costs when OOD samples are mistakenly selected in open-set active learning
Difficulty of jointly balancing the informativeness and purity of unlabeled samples
Existing methods' dependence on out-of-distribution (OOD) samples
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses pre-trained vision-language model for OOD detection
Sequentially evaluates purity and informativeness of data
Minimizes annotation costs by excluding OOD samples
Jaehyuk Heo
School of Industrial & Management Engineering, Korea University, Seoul, South Korea
Pilsung Kang
School of Industrial & Management Engineering, Korea University, Seoul, South Korea