Balancing Uncertainty and Diversity of Samples: Leveraging Diversity of Least, High Confidence Samples for Effective Active Learning

📅 2026-05-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of existing active learning approaches: they typically select samples by a single criterion, either uncertainty or diversity, and so struggle to balance informativeness against representativeness. The authors propose four hybrid sampling strategies that combine model confidence with diversity, and evaluate them on high-confidence and low-confidence samples separately. Among them, the Least Confident and Diverse (LCD) method selects samples that are both low-confidence and diverse. The methods are studied in the context of deep vision models such as CNNs and Vision Transformers. In the reported experiments, LCD consistently performs better than state-of-the-art methods, which the authors attribute to uncertain and diverse samples helping the model learn more distinct features.

📝 Abstract

Deep learning models, including Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), have achieved state-of-the-art performance on various computer vision tasks such as object classification, detection, segmentation, and generation. However, these models are data-hungry: they require large amounts of training data to learn millions or billions of parameters. For supervised learning tasks in particular, curating a large number of labeled samples for model training is expensive and time-consuming. Active Learning (AL) has been used to address this problem for many years. Existing active learning methods choose samples for annotation from a pool of unlabeled samples that are either diverse or uncertain. Selecting along only one of these dimensions, i.e., either diversity or uncertainty, may hinder the model's performance. In this paper, we propose four novel hybrid sampling methods for pooling both easy and hard samples that are also diverse. To verify the efficacy of the proposed methods, extensive experiments are conducted using high- and low-confidence samples separately. We observe from our experiments that the proposed hybrid sampling method, Least Confident and Diverse (LCD), consistently performs better than state-of-the-art methods. We also observe that selecting uncertain and diverse instances helps the model learn more distinct features. The code related to this study will be available at https://github.com/XXX/LCD.
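
The abstract does not spell out how LCD combines confidence and diversity, so the following is only an illustrative sketch of one common way to do it, not the authors' implementation. It assumes confidence is the maximum softmax probability, that diversity is measured by Euclidean distance between feature embeddings, and that diverse samples are picked greedily by farthest-point selection. The function name `least_confident_diverse` and the `pool_factor` parameter are hypothetical.

```python
import numpy as np

def least_confident_diverse(probs, feats, budget, pool_factor=3):
    """Pick up to `budget` pool indices that are low-confidence and far apart.

    probs: (n, classes) predicted class probabilities for the unlabeled pool.
    feats: (n, d) feature embeddings for the same samples.
    pool_factor: hypothetical knob; the candidate set holds
                 budget * pool_factor of the least confident samples.
    """
    probs = np.asarray(probs, dtype=float)
    feats = np.asarray(feats, dtype=float)
    if budget <= 0 or len(probs) == 0:
        return []

    # Step 1 (uncertainty): confidence = highest class probability;
    # keep the samples where it is lowest.
    confidence = probs.max(axis=1)
    n_candidates = min(len(probs), budget * pool_factor)
    candidates = np.argsort(confidence, kind="stable")[:n_candidates]

    # Step 2 (diversity): greedy farthest-point selection among candidates,
    # seeded with the single least confident sample.
    cand_feats = feats[candidates]
    chosen = [0]
    dist = np.linalg.norm(cand_feats - cand_feats[0], axis=1)
    while len(chosen) < min(budget, n_candidates):
        nxt = int(np.argmax(dist))  # candidate farthest from everything chosen
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(cand_feats - cand_feats[nxt], axis=1))

    return [int(candidates[i]) for i in chosen]
```

In an active learning loop, the returned indices would be sent for annotation and moved from the unlabeled pool to the training set before the model is retrained. Under the same assumptions, sorting by highest confidence in step 1 instead would give a high-confidence, diverse variant of the kind the abstract says is evaluated separately.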

Problem

Research questions and friction points this paper is trying to address.

Active Learning

Uncertainty

Diversity

Sample Selection

Deep Learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Active Learning

Sample Diversity

Uncertainty Sampling