Revisiting Unknowns: Towards Effective and Efficient Open-Set Active Learning

📅 2026-03-09
🤖 AI Summary
This work addresses the challenge of identifying and leveraging unseen-class samples in open-set active learning without relying on an additional out-of-distribution detector. The proposed E²OAL framework uses label-guided clustering to uncover the structure of unknown classes and integrates a Dirichlet-calibrated auxiliary head to jointly model known and unknown categories. It introduces a two-stage adaptive query strategy and repurposes labeled unknown samples to strengthen supervision for known classes. Key technical contributions include a structure-aware F1-product optimization objective, a logit-margin purity score, and an informativeness measure tailored for open-set active learning. Extensive experiments show that E²OAL consistently outperforms existing methods across multiple benchmarks in accuracy, query precision, and computational efficiency.

📝 Abstract
Open-set active learning (OSAL) aims to identify informative samples for annotation when unlabeled data may contain previously unseen classes, a common challenge in safety-critical and open-world scenarios. Existing approaches typically rely on separately trained open-set detectors, introducing substantial training overhead and overlooking the supervisory value of labeled unknowns for improving known-class learning. In this paper, we propose E$^2$OAL (Effective and Efficient Open-set Active Learning), a unified and detector-free framework that fully exploits labeled unknowns for both stronger supervision and more reliable querying. E$^2$OAL first uncovers the latent class structure of unknowns through label-guided clustering in a frozen contrastively pre-trained feature space, optimized by a structure-aware F1-product objective. To leverage labeled unknowns, it employs a Dirichlet-calibrated auxiliary head that jointly models known and unknown categories, improving both confidence calibration and known-class discrimination. Building on this, a logit-margin purity score estimates the likelihood of known classes to construct a high-purity candidate pool, while an OSAL-specific informativeness metric prioritizes partially ambiguous yet reliable samples. These components together form a flexible two-stage query strategy with adaptive precision control and minimal hyperparameter sensitivity. Extensive experiments across multiple OSAL benchmarks demonstrate that E$^2$OAL consistently surpasses state-of-the-art methods in accuracy, efficiency, and query precision, highlighting its effectiveness and practicality for real-world applications. The code is available at github.com/chenchenzong/E2OAL.
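To make the two-stage query idea in the abstract concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not give the formulas, so the purity score (best known-class logit minus best unknown-class logit), the informativeness score (negative gap between the top two known-class logits), and the names `logit_margin_purity`, `known_class_ambiguity`, `two_stage_query`, and `pool_factor` are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def logit_margin_purity(logits: Sequence[float], num_known: int) -> float:
    """Assumed purity score: best known-class logit minus best unknown-class logit."""
    return max(logits[:num_known]) - max(logits[num_known:])


def known_class_ambiguity(logits: Sequence[float], num_known: int) -> float:
    """Assumed informativeness: negative gap between the top two known-class logits.

    A smaller gap means the model hesitates between two known classes,
    so the score is higher (closer to zero).
    """
    top = sorted(logits[:num_known], reverse=True)
    return -(top[0] - top[1])


def two_stage_query(
    pool_logits: Sequence[Sequence[float]],
    num_known: int,
    budget: int,
    pool_factor: int = 2,  # hypothetical knob: candidate pool size relative to budget
) -> List[int]:
    # Stage 1: keep the samples most likely to belong to known classes.
    by_purity = sorted(
        range(len(pool_logits)),
        key=lambda i: logit_margin_purity(pool_logits[i], num_known),
        reverse=True,
    )
    candidates = by_purity[: pool_factor * budget]
    # Stage 2: among those, query the most ambiguous ones.
    by_info = sorted(
        candidates,
        key=lambda i: known_class_ambiguity(pool_logits[i], num_known),
        reverse=True,
    )
    return by_info[:budget]


# Toy pool: two known-class logits followed by one unknown-class logit.
pool = [
    [4.0, 0.5, 0.1],  # confidently known, not ambiguous
    [2.1, 2.0, 0.2],  # likely known, ambiguous between known classes
    [0.3, 0.2, 5.0],  # ambiguous, but almost certainly unknown
    [1.5, 1.0, 0.9],  # weakly known
]
selected = two_stage_query(pool, num_known=2, budget=1)
```

In this toy pool the third sample is as ambiguous as the second, but stage 1 filters it out because its unknown-class logit dominates. That filtering is what a purity-first design is meant to achieve.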
Problem

Research questions and friction points this paper is trying to address.

open-set active learning
unknown classes
sample selection
annotation efficiency
supervision from labeled unknowns
Innovation

Methods, ideas, or system contributions that make the work stand out.

open-set active learning
label-guided clustering
Dirichlet calibration
logit-margin purity
two-stage query strategy
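As a rough illustration of the Dirichlet calibration listed above, the sketch below uses the standard evidential parameterization: softplus evidence plus one gives the Dirichlet concentration, and the number of classes divided by the total concentration gives an overall uncertainty. Whether E²OAL uses exactly this form is not stated on this page, so treat the function `dirichlet_from_logits` and its outputs as assumptions.

```python
import math
from typing import List, Sequence, Tuple


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)


def dirichlet_from_logits(logits: Sequence[float]) -> Tuple[List[float], float]:
    """Map joint known+unknown logits to Dirichlet mean probabilities and uncertainty.

    Assumed parameterization: alpha_k = softplus(logit_k) + 1.
    """
    alpha = [softplus(z) + 1.0 for z in logits]
    strength = sum(alpha)                   # total concentration
    probs = [a / strength for a in alpha]   # Dirichlet mean
    uncertainty = len(alpha) / strength     # high when total evidence is low
    return probs, uncertainty


flat_probs, flat_unc = dirichlet_from_logits([0.0, 0.0, 0.0])
peak_probs, peak_unc = dirichlet_from_logits([10.0, 0.0, 0.0])
```

With flat logits the probabilities are uniform and the uncertainty is comparatively high. A single strong logit concentrates the mean on that class and lowers the uncertainty.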
Authors
Chen-Chen Zong
Nanjing University of Aeronautics & Astronautics
machine learning
Yu-Qi Chi
Nanjing University of Aeronautics and Astronautics
Xie-Yang Wang
Nanjing University of Aeronautics and Astronautics
Yan Cui
Nanjing University of Aeronautics and Astronautics
Sheng-Jun Huang
Nanjing University of Aeronautics & Astronautics
Machine Learning