Enhancing Sample Utilization in Noise-Robust Deep Metric Learning With Subgroup-Based Positive-Pair Selection

📅 2024-10-22
🏛️ IEEE Transactions on Image Processing
📈 Citations: 0
Influential: 0
🤖 AI Summary
In deep metric learning (DML), label noise compromises the reliability of positive pairs; existing robust methods typically discard suspicious samples, leading to data inefficiency and limited robustness. To address this, we propose Subgroup-Guided Positive Sample Selection (SGPS), the first framework for noise-robust DML that actively exploits noisy data by uncovering latent subgroup structures and aggregating positive prototypes. SGPS comprises four key components: (i) probability-driven clean sample screening, (ii) subgroup generation via clustering in embedding space, (iii) prototype aggregation within subgroups, and (iv) a customized contrastive loss that leverages subgroup-aware positive pairs. Extensive experiments on multiple synthetic and large-scale real-world noisy benchmarks demonstrate that SGPS consistently outperforms state-of-the-art noise-robust DML methods. It achieves superior accuracy and generalization in image retrieval and face recognition tasks, validating its effectiveness in harnessing noisy supervision without sacrificing performance.

📝 Abstract
The existence of noisy labels in real-world data negatively impacts the performance of deep learning models. Although much research effort has been devoted to improving robustness towards noisy labels in classification tasks, the problem of noisy labels in deep metric learning (DML) remains under-explored. Existing noisy label learning methods designed for DML mainly discard suspicious noisy samples, resulting in a waste of the training data. To address this issue, we propose a noise-robust DML framework with SubGroup-based Positive-pair Selection (SGPS), which constructs reliable positive pairs for noisy samples to enhance sample utilization. Specifically, SGPS first identifies clean and noisy samples by a probability-based clean sample selection strategy. To further utilize the remaining noisy samples, we discover their potential similar samples based on the subgroup information given by a subgroup generation module, and then aggregate them into informative positive prototypes for each noisy sample via a positive prototype generation module. Afterward, a new contrastive loss is tailored for the noisy samples with their selected positive pairs. SGPS can be easily integrated into the training process of existing pair-wise DML tasks, like image retrieval and face recognition. Extensive experiments on multiple synthetic and real-world large-scale label noise datasets demonstrate the effectiveness of our proposed method. Without any bells and whistles, our SGPS framework outperforms the state-of-the-art noisy label DML methods.
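The four components described in the abstract can be sketched on toy data. Note that everything below is a simplified stand-in rather than the paper's actual formulation: a small-loss rule approximates the probability-based clean screening, plain k-means approximates the subgroup generation module, and a margin loss approximates the tailored contrastive loss.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: two true classes in a 2-D embedding space, three flipped labels.
n = 30
emb = np.vstack([rng.normal([ 2.0, 0.0], 0.3, size=(n, 2)),
                 rng.normal([-2.0, 0.0], 0.3, size=(n, 2))])
labels = np.array([0] * n + [1] * n)
labels[:3] = 1  # three class-0 points mislabeled as class 1 (label noise)

# (i) Clean-sample screening -- a small-loss rule stands in for the paper's
# probability-based selection: per-sample loss = distance to the centroid of
# the *labeled* class; the highest-loss samples are treated as noisy.
centroids = np.stack([emb[labels == c].mean(axis=0) for c in (0, 1)])
loss = np.linalg.norm(emb - centroids[labels], axis=1)
noisy_idx = np.where(loss > np.percentile(loss, 90))[0]

# (ii) Subgroup generation -- plain 2-way k-means over all embeddings stands
# in for the paper's subgroup generation module.
def kmeans2(x, iters=20):
    # farthest-point init keeps both clusters non-empty in this sketch
    centers = np.stack([x[0], x[np.argmax(np.linalg.norm(x - x[0], axis=1))]])
    for _ in range(iters):
        assign = np.argmin(((x[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        centers = np.stack([x[assign == j].mean(axis=0) for j in (0, 1)])
    return assign, centers

assign, sub_centers = kmeans2(emb)

# (iii) Positive prototype per noisy sample: the mean embedding (center) of
# the subgroup that sample falls into, recovering it as a usable positive.
prototypes = sub_centers[assign[noisy_idx]]

# (iv) A contrastive-style loss pulls each noisy sample toward its subgroup
# prototype and pushes it away from the other subgroup's center (a generic
# margin form; the paper tailors its own loss).
d_pos = np.linalg.norm(emb[noisy_idx] - prototypes, axis=1)
d_neg = np.linalg.norm(emb[noisy_idx] - sub_centers[1 - assign[noisy_idx]], axis=1)
sgps_loss = np.maximum(0.0, d_pos - d_neg + 1.0).mean()

print(f"flagged noisy: {noisy_idx.tolist()}  loss: {sgps_loss:.3f}")
```

In this toy run the three mislabeled points incur the largest losses, get flagged as noisy, and are matched to the prototype of the subgroup they actually belong to, illustrating how SGPS turns suspicious samples into positive pairs instead of discarding them.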
Problem

Research questions and friction points this paper is trying to address.

Deep Metric Learning
Noisy Labels
Data Efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Subgroup-based Positive Pair Selection
Noise Robust Deep Metric Learning
Contrastive Loss Customization
Zhipeng Yu
School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
Qianqian Xu
Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
Yangbangyan Jiang
University of Chinese Academy of Sciences
Machine Learning, Deep Learning
Yingfei Sun
School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
Qingming Huang
University of the Chinese Academy of Sciences
Multimedia Analysis and Retrieval, Image and Video Processing, Pattern Recognition, Computer Vision, Video Coding