Optimizing Active Learning in Vision-Language Models via Parameter-Efficient Uncertainty Calibration

📅 2025-07-29
🤖 AI Summary
Addressing two key challenges in active learning for large-scale vision-language models (VLMs)—inaccurate uncertainty estimation and low sampling efficiency—this paper proposes a parameter-efficient, differentiable uncertainty calibration framework. Methodologically, the authors introduce a novel uncertainty calibration loss function and jointly optimize prompt tuning and LoRA to enable lightweight, end-to-end uncertainty modeling, eliminating reliance on hand-crafted features and enabling high-confidence sample selection with only a few labeled examples. Extensive experiments across multiple vision-language benchmarks and diverse backbone architectures demonstrate that the approach significantly outperforms existing active learning strategies under limited annotation budgets: it maintains or improves accuracy while reducing training overhead by 30–50%. To the best of the authors' knowledge, this is the first work to achieve high-accuracy, low-cost, fully differentiable, end-to-end active learning for large-scale VLMs.

📝 Abstract
Active Learning (AL) has emerged as a powerful approach for minimizing labeling costs by selectively sampling the most informative data for neural network model development. Effective AL for large-scale vision-language models necessitates addressing challenges in uncertainty estimation and efficient sampling given the vast number of parameters involved. In this work, we introduce a novel parameter-efficient learning methodology that incorporates uncertainty calibration loss within the AL framework. We propose a differentiable loss function that promotes uncertainty calibration for effectively selecting fewer, more informative data samples for fine-tuning. Through extensive experiments across several datasets and vision backbones, we demonstrate that our solution can match and exceed the performance of complex feature-based sampling techniques while being computationally very efficient. Additionally, we investigate the efficacy of Prompt learning versus Low-rank adaptation (LoRA) in sample selection, providing a detailed comparative analysis of these methods in the context of efficient AL.
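The abstract's central idea—a differentiable loss that encourages calibrated confidence during fine-tuning—can be illustrated with a toy sketch. The paper's actual loss is not reproduced here; the function below is a hypothetical stand-in that adds a Brier-style penalty pulling the top-class confidence toward correctness, on top of standard cross-entropy:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def calibration_aware_loss(logits, label, lam=0.5):
    """Toy sketch of a calibration-promoting training objective:
    cross-entropy plus a differentiable penalty on the gap between
    top-class confidence and correctness (illustrative only; the
    paper's actual calibration loss is not reproduced here)."""
    probs = softmax(logits)
    ce = -math.log(probs[label] + 1e-12)          # standard cross-entropy
    conf = max(probs)                              # top-class confidence
    correct = 1.0 if probs.index(conf) == label else 0.0
    penalty = (conf - correct) ** 2                # calibration gap term
    return ce + lam * penalty
```

Because every term is smooth in the logits (apart from the argmax indicator, which is piecewise constant), such a penalty can be minimized jointly with the task loss during parameter-efficient fine-tuning.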
Problem

Research questions and friction points this paper is trying to address.

Optimizing active learning for vision-language models efficiently
Improving uncertainty calibration in large-scale model sampling
Comparing Prompt learning and LoRA for sample selection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parameter-efficient uncertainty calibration loss
Differentiable loss for informative sampling
Prompt learning versus LoRA comparison
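Once predictive uncertainty is calibrated, the acquisition step reduces to ranking the unlabeled pool by an uncertainty score and labeling the top candidates. The sketch below uses predictive entropy as that score; this is a common hypothetical choice for illustration, and the paper's actual selection rule may differ:

```python
import math

def predictive_entropy(probs):
    """Entropy of a categorical predictive distribution; higher
    values indicate a more uncertain (informative) sample."""
    return -sum(p * math.log(p + 1e-12) for p in probs)

def select_for_labeling(pool_probs, budget):
    """Hypothetical acquisition step: rank the unlabeled pool by
    predictive entropy and return indices of the `budget` most
    uncertain samples to send for annotation."""
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: predictive_entropy(pool_probs[i]),
                    reverse=True)
    return ranked[:budget]
```

With well-calibrated probabilities, this ranking requires no hand-crafted features or auxiliary models, which is the sampling-efficiency argument the summary makes.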
Athmanarayanan Lakshmi Narayanan
Intel Labs, Intel Corporation, Santa Clara, USA
Amrutha Machireddy
Intel Labs, Intel Corporation, Bangalore, IN
Ranganath Krishnan
Research Scientist, Intel Labs
Machine Learning · Deep Learning · Probabilistic Machine Learning · Artificial Intelligence