Pre-trained Vision-Language Models Assisted Noisy Partial Label Learning

📅 2025-06-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
Learning from noisy partial labels (NPLL) automatically generated by pre-trained vision-language models (VLMs) is challenging: the noise is strongly instance-dependent and severely degrades learning reliability. Method: We propose Collaborative Consistency Regularization (Co-Reg), the first method to explicitly model the instance-dependent nature of VLM-generated noise. Co-Reg trains two networks that purify each other's labels through a collaborative pseudo-labeling mechanism, jointly enforcing consistency constraints in both the label and feature spaces. The core pipeline requires no manual annotations, and few-shot human-annotated labels can optionally be incorporated to further improve performance. Results: Extensive experiments across diverse noise patterns, annotation strategies, and VLM backbones (e.g., CLIP, LLaVA) demonstrate that Co-Reg consistently outperforms state-of-the-art denoising and disambiguation methods, validating the effectiveness and generalizability of weakly supervised, VLM-driven knowledge distillation for NPLL.

📝 Abstract
In the context of noisy partial label learning (NPLL), each training sample is associated with a set of candidate labels annotated by multiple noisy annotators. With the emergence of high-performance pre-trained vision-language models (VLMs) such as CLIP, LLaVA, and GPT-4V, using these models to replace time-consuming manual annotation workflows and achieve "manual-annotation-free" training for downstream tasks has become a highly promising research avenue. This paper focuses on learning from noisy partial labels annotated by pre-trained VLMs and proposes an innovative collaborative consistency regularization (Co-Reg) method. Unlike the symmetric noise primarily addressed in traditional noisy label learning, the noise generated by pre-trained models is instance-dependent: it embodies the underlying patterns of the pre-trained models themselves, which significantly increases the learning difficulty. To address this, we simultaneously train two neural networks that collaboratively purify the training labels through a "Co-Pseudo-Labeling" mechanism, while enforcing consistency regularization constraints in both the label space and the feature representation space. Our method can also leverage few-shot manually annotated valid labels to further enhance its performance. Comparative experiments with different denoising and disambiguation algorithms, annotation schemes, and pre-trained model application strategies fully validate the effectiveness of the proposed method, while revealing the broad prospects of integrating weakly-supervised learning techniques into the knowledge distillation process of pre-trained models.
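The core co-pseudo-labeling idea described above — each network restricts its peer's softmax predictions to the sample's candidate label set, renormalizes, and uses the result as a purification target — can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation; the function names and the simple cross-entropy form of the label-space consistency term are assumptions.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def co_pseudo_labels(logits, candidate_mask, eps=1e-12):
    """Restrict one network's predictions to each sample's candidate
    label set and renormalize, yielding pseudo-labels for the peer."""
    probs = softmax(logits) * candidate_mask  # zero out non-candidates
    return probs / (probs.sum(axis=1, keepdims=True) + eps)

def consistency_loss(logits, targets, eps=1e-12):
    """Cross-entropy between one network's predictions and the peer's
    purified pseudo-labels (label-space consistency)."""
    log_p = np.log(softmax(logits) + eps)
    return -(targets * log_p).sum(axis=1).mean()

# Toy batch: 2 samples, 4 classes; candidate sets from a noisy annotator.
rng = np.random.default_rng(0)
logits_a = rng.normal(size=(2, 4))          # network A
logits_b = rng.normal(size=(2, 4))          # network B
mask = np.array([[1, 1, 0, 0],              # sample 0: candidates {0, 1}
                 [0, 1, 1, 1]], dtype=float)  # sample 1: candidates {1, 2, 3}

# Each network learns from pseudo-labels purified by its peer.
loss_a = consistency_loss(logits_a, co_pseudo_labels(logits_b, mask))
loss_b = consistency_loss(logits_b, co_pseudo_labels(logits_a, mask))
```

Because each network's targets come from the other network, errors that only one network makes tend not to be reinforced, which is the usual motivation for dual-network label purification.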
Problem

Research questions and friction points this paper is trying to address.

Addressing noisy partial label learning with pre-trained VLMs
Handling instance-dependent noise from pre-trained model annotations
Improving label purification via collaborative consistency regularization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses pre-trained VLMs for noisy label annotation
Proposes Co-Reg method for label purification
Enforces consistency in label and feature spaces
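The feature-space side of the consistency constraint can be illustrated with a simple cosine-agreement penalty between the two networks' embeddings of the same samples. This is an assumed formulation for illustration only — the paper does not specify this exact loss — and the function name is hypothetical.

```python
import numpy as np

def feature_consistency(feat_a, feat_b):
    """Feature-space consistency: penalize disagreement between the two
    networks' L2-normalized embeddings (1 - cosine similarity, averaged)."""
    a = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    b = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    return (1.0 - (a * b).sum(axis=1)).mean()

# Toy embeddings: 4 samples, 8-dimensional features.
feats = np.random.default_rng(1).normal(size=(4, 8))
loss_same = feature_consistency(feats, feats)  # identical features, loss ~ 0
```

Minimizing such a term pulls the two networks' representations of each sample together, complementing the label-space constraint.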