Robust Duality Learning for Unsupervised Visible-Infrared Person Re-Identification

Date: 2025-05-05
Venue: IEEE Transactions on Information Forensics and Security
Citations: 0
Influential: 0
AI Summary
Unsupervised visible-infrared person re-identification (UVI-ReID) faces two challenges: cross-modal heterogeneity and pseudo-label noise, the latter comprising noise overfitting, error accumulation, and noisy cluster correspondence. To address these, the paper proposes RoDE, a robust duality learning framework that combines Robust Adaptive Learning (RAL), alternating dual-model training, and Cluster Consistency Matching (CCM). RoDE suppresses pseudo-label noise by dynamically reweighting samples, alternately training two distinct models on each other's pseudo-labels, and aligning clusters across modalities and models through cross-cluster similarities. Extensive experiments on the SYSU-MM01, RegDB, and LLVIP benchmarks demonstrate state-of-the-art performance, with mAP improvements of up to 6.2% over prior methods. Ablation studies confirm that each component contributes to noise robustness and cross-modal generalization.

Abstract
Unsupervised visible-infrared person re-identification (UVI-ReID) aims at retrieving pedestrian images of the same individual across distinct modalities. It is challenging because of the inherent heterogeneity gap and the absence of annotations, which are prohibitively costly to obtain. Although existing methods employ self-training with clustering-generated pseudo-labels to bridge this gap, they implicitly assume that these pseudo-labels are predicted correctly. In practice, this assumption cannot hold: training a perfect model is difficult even with ground truth, let alone without it, so pseudo-labeling errors are unavoidable. Based on this observation, this study introduces a new learning paradigm for UVI-ReID that accounts for Pseudo-Label Noise (PLN), which encompasses three challenges: noise overfitting, error accumulation, and noisy cluster correspondence. To address these challenges, we propose a novel robust duality learning framework (RoDE) for UVI-ReID that mitigates the adverse impact of noisy pseudo-labels. Specifically, for noise overfitting, we propose a novel Robust Adaptive Learning mechanism (RAL) that dynamically prioritizes clean samples while deprioritizing noisy ones, thus avoiding overemphasis on noise. To circumvent the error accumulation of self-training, where the model tends to confirm its own mistakes, RoDE alternately trains two distinct models, each using pseudo-labels predicted by its counterpart, thereby maintaining diversity and avoiding collapse into noise. However, this introduces cluster misalignment between the two models, on top of the misalignment between modalities, yielding a dual noisy cluster correspondence that makes optimization difficult. To address this issue, a Cluster Consistency Matching mechanism (CCM) is presented that ensures reliable alignment across modalities as well as across models by leveraging cross-cluster similarities.
Extensive experiments on three benchmark datasets demonstrate the effectiveness of the proposed RoDE.
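The abstract describes RAL only at a high level: clean samples are prioritized and noisy ones deprioritized. A minimal sketch of that idea follows, assuming the common small-loss heuristic (samples with low loss are more likely to carry correct pseudo-labels) and a softmax over negative losses. The function names and the `temperature` parameter are hypothetical, and this is not claimed to be the RAL formulation used in RoDE.

```python
import math

def loss_based_weights(losses, temperature=1.0):
    """Map per-sample losses to weights that sum to 1 (lower loss -> higher weight).

    Assumption: a softmax over negative losses, used here for illustration only.
    """
    scaled = [-loss / temperature for loss in losses]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_loss(losses, weights):
    """Aggregate per-sample losses with the given weights."""
    return sum(w * loss for w, loss in zip(weights, losses))

# Two samples that fit their pseudo-labels well and one probable outlier.
losses = [0.2, 0.3, 2.5]
weights = loss_based_weights(losses)
robust = weighted_loss(losses, weights)
plain = sum(losses) / len(losses)
```

In this toy case the high-loss sample receives the smallest weight, so the weighted objective is less dominated by it than the plain average is.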
Problem

Research questions and friction points this paper is trying to address.

Addresses pseudo-label noise in unsupervised visible-infrared person re-identification
Mitigates noise overfitting and error accumulation in self-training models
Aligns clusters across modalities to resolve noisy cluster correspondence
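Error accumulation arises when a single model is retrained on its own pseudo-labels; the abstract's remedy is to train two models alternately, each on labels predicted by the other. The loop below is a schematic sketch of that alternation only. `CentroidModel` is a hypothetical 1-D nearest-centroid stand-in for a ReID network, and the toy does not model how RoDE keeps the two models diverse.

```python
class CentroidModel:
    """Toy 1-D nearest-centroid classifier (a stand-in, not a ReID network)."""

    def __init__(self, centroids):
        self.centroids = list(centroids)

    def predict(self, xs):
        # Assign each point to the index of its nearest centroid.
        return [
            min(range(len(self.centroids)), key=lambda k: abs(x - self.centroids[k]))
            for x in xs
        ]

    def fit(self, xs, labels):
        # Move each centroid to the mean of the points labeled with it.
        for k in range(len(self.centroids)):
            members = [x for x, y in zip(xs, labels) if y == k]
            if members:  # keep the old centroid if the cluster is empty
                self.centroids[k] = sum(members) / len(members)

def alternate_training(model_a, model_b, xs, rounds=3):
    """Each model is updated only with pseudo-labels from its counterpart."""
    for _ in range(rounds):
        model_b.fit(xs, model_a.predict(xs))  # B learns from A's pseudo-labels
        model_a.fit(xs, model_b.predict(xs))  # A learns from B's pseudo-labels
    return model_a, model_b

xs = [0.0, 0.2, 0.4, 3.6, 3.8, 4.0]
a, b = alternate_training(CentroidModel([0.5, 1.5]), CentroidModel([1.0, 5.0]), xs)
```

The point of the structure is that neither `fit` call ever consumes labels produced by the model being updated.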
Innovation

Methods, ideas, or system contributions that make the work stand out.

Robust Adaptive Learning for clean sample emphasis
Dual-model training to prevent error accumulation
Cluster Consistency Matching for cross-cluster alignment
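The source says only that CCM aligns clusters by leveraging cross-cluster similarities. One simple way to illustrate similarity-based cluster alignment is mutual-best matching on centroid cosine similarity, sketched below in plain Python. The matching rule, the function names, and the toy centroids are assumptions for illustration, not the CCM procedure from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def mutual_best_matches(centroids_a, centroids_b):
    """Return {index_in_a: index_in_b} for pairs that are each other's best match."""
    sim = [[cosine(a, b) for b in centroids_b] for a in centroids_a]
    best_for_a = [
        max(range(len(centroids_b)), key=lambda j: row[j]) for row in sim
    ]
    best_for_b = [
        max(range(len(centroids_a)), key=lambda i: sim[i][j])
        for j in range(len(centroids_b))
    ]
    # Keep a pair only when the preference is reciprocal.
    return {i: j for i, j in enumerate(best_for_a) if best_for_b[j] == i}

# Hypothetical cluster centroids from the visible and infrared modalities.
visible = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
infrared = [[0.1, 0.9], [0.9, 0.1]]
matches = mutual_best_matches(visible, infrared)
```

Here the third visible cluster stays unmatched because neither infrared cluster prefers it, which is one way a matching step can avoid forcing an unreliable correspondence.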
Authors

Yongxiang Li
Professor, RMIT University
Electronic Materials and Devices

Yuan Sun
College of Computer Science, Sichuan University, Chengdu 610065, China

Yang Qin
College of Computer Science, Sichuan University, Chengdu 610065, China

Dezhong Peng
Sichuan University
Multi-modal Learning, Multimedia Analysis, Neural Network

Xi Peng
College of Computer Science, Sichuan University, Chengdu 610065, China

Peng Hu
College of Computer Science, Sichuan University, Chengdu 610065, China