Avoid Wasted Annotation Costs in Open-set Active Learning with Pre-trained Vision-Language Model

📅 2024-08-09
🏛️ arXiv.org
📈 Citations: 1
✨ Influential: 0
🤖 AI Summary
In open-set active learning, unlabeled data often contain out-of-distribution (OOD) samples, and blindly annotating them wastes substantial labeling cost. Existing methods struggle to jointly optimize sample informativeness and in-distribution (ID) purity, and rely heavily on OOD labels or auxiliary training. This paper proposes CLIPNAL, a novel active learning framework that, for the first time, eliminates reliance on labeled OOD data. It leverages a pre-trained CLIP model to perform OOD detection and filtering without any additional training, then selects highly informative samples exclusively from the remaining ID pool. CLIPNAL jointly improves purity and informativeness via semantic-visual alignment scoring and a two-stage selection strategy (purity filtering followed by informativeness-based ranking), requiring neither OOD annotations nor extra model training. Experiments across diverse open-set settings show that CLIPNAL achieves state-of-the-art model performance at the lowest annotation cost, significantly outperforming existing approaches.
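The summary's "unsupervised OOD detection with a pre-trained CLIP model" can be illustrated with a small sketch. This is a hedged simplification of the CLIPN idea of pairing each class with a standard prompt and a negation ("no") prompt; the function name, shapes, and scoring formula here are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

def purity_score(image_feat, yes_text_feats, no_text_feats, temperature=100.0):
    """Illustrative CLIPN-style ID/OOD score (an assumption, not the paper's code).
    For each ID class c, similarity to a standard ("yes") prompt and to a
    negation ("no") prompt yields a per-class probability that the image
    belongs to c; an image unlikely to match any ID class is treated as OOD.
    image_feat:     (D,)   L2-normalized image embedding
    yes_text_feats: (C, D) L2-normalized standard-prompt embeddings
    no_text_feats:  (C, D) L2-normalized negation-prompt embeddings
    Returns a scalar in [0, 1]; higher means more likely in-distribution.
    """
    yes_logits = temperature * (yes_text_feats @ image_feat)  # (C,)
    no_logits = temperature * (no_text_feats @ image_feat)    # (C,)

    # Class posterior from the standard prompts (softmax, max-shifted for stability).
    p_class = np.exp(yes_logits - yes_logits.max())
    p_class /= p_class.sum()

    # Per-class probability that the "yes" prompt wins over the "no" prompt.
    p_yes = 1.0 / (1.0 + np.exp(no_logits - yes_logits))      # sigmoid

    # ID probability: sum over classes of P(c) * P(yes | c).
    return float((p_class * p_yes).sum())
```

Samples scoring below a purity threshold would be excluded before any informativeness ranking, which is what avoids spending annotation budget on OOD data.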

πŸ“ Abstract
Active learning (AL) aims to enhance model performance by selectively collecting highly informative data, thereby minimizing annotation costs. However, in practical scenarios, unlabeled data may contain out-of-distribution (OOD) samples, leading to wasted annotation costs if data is incorrectly selected. Recent research has explored methods to apply AL to open-set data, but these methods either require OOD samples in advance or still incur unavoidable annotation-cost losses while trying to minimize them. To address these challenges, we propose a novel selection strategy, CLIPN for AL (CLIPNAL), which minimizes cost losses without requiring OOD samples. CLIPNAL sequentially evaluates the purity and informativeness of data. First, it utilizes a pre-trained vision-language model to detect and exclude OOD data by leveraging the linguistic and visual information of in-distribution (ID) data, without additional training. Second, it selects highly informative data from the remaining ID data, and the selected samples are then annotated by human experts. Experimental results on datasets with various open-set conditions demonstrate that CLIPNAL achieves the lowest cost loss and highest performance across all scenarios. Code is available at https://github.com/DSBA-Lab/OpenAL.
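The sequential "purity first, then informativeness" evaluation described in the abstract can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions (the function name, the purity threshold, the temperature, and entropy as the informativeness score are all illustrative choices, not the paper's implementation):

```python
import numpy as np

def select_queries(image_feats, text_feats, budget, purity_threshold=0.5):
    """Two-stage query selection sketch (assumed simplification of CLIPNAL).
    Stage 1 (purity): keep samples whose best image-text similarity is high,
    i.e. likely in-distribution; the rest are treated as OOD and excluded.
    Stage 2 (informativeness): among kept samples, rank by predictive
    entropy over the class similarities and return the top-`budget` indices.
    image_feats: (N, D) L2-normalized image embeddings (e.g. from CLIP)
    text_feats:  (C, D) L2-normalized class-prompt embeddings
    """
    sims = image_feats @ text_feats.T                # (N, C) cosine similarities
    purity = sims.max(axis=1)                        # alignment with best ID class
    id_mask = purity >= purity_threshold             # stage 1: filter likely OOD

    logits = sims * 100.0                            # CLIP-style temperature
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)  # stage 2 score

    entropy = np.where(id_mask, entropy, -np.inf)    # never query filtered samples
    order = np.argsort(-entropy)                     # most informative first
    return order[:budget]
```

Because the OOD filter runs before the informativeness ranking, the annotation budget is spent only on samples the model believes are ID, which is the cost-loss reduction the paper targets.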
Problem

Research questions and friction points this paper is trying to address.

Wasted annotation costs when OOD samples are mistakenly selected in open-set active learning
Difficulty of jointly balancing the informativeness and purity of unlabeled samples
Existing methods' dependence on out-of-distribution (OOD) samples
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses pre-trained vision-language model for OOD detection
Sequentially evaluates purity and informativeness of data
Minimizes annotation costs by excluding OOD samples
Jaehyuk Heo
School of Industrial & Management Engineering, Korea University, Seoul, South Korea
Pilsung Kang
School of Industrial & Management Engineering, Korea University, Seoul, South Korea