Balancing Uncertainty and Diversity of Samples: Leveraging Diversity of Least, High Confidence Samples for Effective Active Learning

📅 2026-05-21

🤖 AI Summary
This work addresses a limitation of existing active learning approaches, which typically select samples by a single criterion, either uncertainty or diversity, and so struggle to balance informativeness against representativeness. The authors propose four hybrid sampling strategies that combine model confidence with sample diversity, and evaluate them separately on high-confidence (easy) and low-confidence (hard) samples. Among them, the Least Confident and Diverse (LCD) method selects samples that are both low-confidence and diverse. The methods target deep models such as CNNs and Vision Transformers. In the reported experiments, LCD consistently outperforms state-of-the-art methods, and the authors observe that selecting uncertain and diverse instances helps the model learn more distinct features.
📝 Abstract
Deep learning models, including Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), have achieved state-of-the-art performance on computer vision tasks such as object classification, detection, segmentation, and generation. However, these models are data-hungry: they need large amounts of training data to learn millions or billions of parameters. For supervised learning in particular, curating a large set of labeled samples is expensive and time-consuming. Active Learning (AL) has long been used to address this problem. Existing active learning methods choose samples for annotation from a pool of unlabeled samples that are either diverse or uncertain. Selecting along only one of these dimensions may hinder the model's performance. In this paper, we propose four novel hybrid sampling methods that select both easy and hard samples that are also diverse. To verify the efficacy of the proposed methods, we conduct extensive experiments using high- and low-confidence samples separately. In our experiments, the proposed hybrid sampling method Least Confident and Diverse (LCD) consistently performs better than state-of-the-art methods. We observe that selecting uncertain and diverse instances helps the model learn more distinct features. The code for this study will be available at https://github.com/XXX/LCD.
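To make the idea of combining uncertainty with diversity concrete, here is a minimal sketch of one way such a hybrid selection could work. It is an illustration under stated assumptions, not the authors' LCD procedure, which the abstract does not specify. It assumes uncertainty is measured by the standard least-confidence score (one minus the top softmax probability) and that diversity is enforced by greedy farthest-point selection over feature vectors. The function names and the `candidate_factor` parameter are hypothetical.

```python
import numpy as np


def least_confidence(probs):
    """Least-confidence score: 1 - max class probability (higher = less confident)."""
    return 1.0 - probs.max(axis=1)


def select_least_confident_diverse(probs, features, budget, candidate_factor=3):
    """Hypothetical two-stage hybrid selection (an assumption, not the paper's method).

    Stage 1: keep the budget * candidate_factor least confident unlabeled samples.
    Stage 2: among those, greedily pick `budget` samples that are far apart
             in feature space (farthest-point selection).
    Returns indices into the unlabeled pool.
    """
    scores = least_confidence(probs)
    n_candidates = min(len(scores), budget * candidate_factor)
    # Candidate indices, most uncertain first.
    candidates = np.argsort(-scores, kind="stable")[:n_candidates]
    cand_feats = features[candidates]

    # Seed with the most uncertain candidate.
    chosen = [0]
    min_dist = np.linalg.norm(cand_feats - cand_feats[0], axis=1)
    min_dist[0] = -np.inf  # never pick the same sample twice

    while len(chosen) < min(budget, n_candidates):
        # Next pick: the candidate farthest from everything chosen so far.
        nxt = int(np.argmax(min_dist))
        chosen.append(nxt)
        dist_to_new = np.linalg.norm(cand_feats - cand_feats[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
        min_dist[nxt] = -np.inf

    return candidates[np.array(chosen)]
```

In this sketch, `probs` would be the model's softmax outputs on the unlabeled pool and `features` an embedding of the same samples (for example, penultimate-layer activations of a CNN or ViT). Restricting the diversity step to an uncertain candidate set is only one design choice; a weighted combination of the two scores would be another.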
Problem

Research questions and friction points this paper is trying to address.

Active Learning
Uncertainty
Diversity
Sample Selection
Deep Learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Active Learning
Sample Diversity
Uncertainty Sampling
Hybrid Sampling
Least Confident and Diverse
Vipul Arya
School of Computer Science and Engineering, RV University, Bengaluru, Karnataka-560059, India
S. H. Shabbeer Basha
School of Engineering & Technology, Vidyashilp University, Bangalore, India
Srikrishna U N
School of Computer Science and Engineering, RV University, Bengaluru, Karnataka-560059, India
Sunainha Vijay
School of Computer Science and Engineering, RV University, Bengaluru, Karnataka-560059, India
Snehasis Mukherjee
Shiv Nadar Institution of Eminence, Delhi NCR
Computer Vision
Image/Video Processing
Machine Learning and Graphics