🤖 AI Summary
This work addresses the performance bottlenecks in person re-identification caused by appearance variations, domain shift, and scarce annotations by proposing novel approaches under three settings: supervised, unsupervised domain adaptation, and fully unsupervised learning. Specifically, it introduces a supervised contrastive learning framework with a hybrid loss, a GAN-enhanced pseudo-label refinement strategy, and a Vision Transformer-based method incorporating camera-aware proxy learning. Key innovations include camera identity constraints, domain-invariant feature mapping, and a multi-granularity contrastive mechanism, which collectively enhance feature discriminability and cross-domain generalization. The proposed methods achieve state-of-the-art performance on standard benchmarks such as Market-1501 and CUHK03, with improvements of up to 12% in mAP and Rank-1 accuracy on cross-domain tasks.
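The mAP and Rank-1 figures quoted above are ranking metrics computed per query and then averaged. The sketch below is a minimal illustration, assuming each query already comes with a gallery ranking sorted by decreasing similarity. The function names and the toy identities are hypothetical, and the same-camera filtering that standard ReID protocols apply before scoring is omitted for brevity.

```python
def average_precision(ranked_ids, query_id):
    """AP for one query: mean of precision@k over the ranks k that hold a correct match."""
    hits = 0
    precisions = []
    for k, gallery_id in enumerate(ranked_ids, start=1):
        if gallery_id == query_id:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / hits if hits else 0.0


def rank1(ranked_ids, query_id):
    """1.0 if the top-ranked gallery item shares the query identity, else 0.0."""
    return 1.0 if ranked_ids and ranked_ids[0] == query_id else 0.0


def evaluate(queries):
    """queries: list of (query_id, ranked gallery ids). Returns (mAP, Rank-1)."""
    aps = [average_precision(ranked, qid) for qid, ranked in queries]
    r1s = [rank1(ranked, qid) for qid, ranked in queries]
    return sum(aps) / len(aps), sum(r1s) / len(r1s)


# Toy data (hypothetical identities, not taken from any benchmark).
queries = [
    ("A", ["A", "B", "A", "C"]),  # AP = (1/1 + 2/3) / 2
    ("B", ["C", "B", "A", "B"]),  # AP = (1/2 + 2/4) / 2
]
mAP, r1 = evaluate(queries)
print(f"mAP={mAP:.3f} Rank-1={r1:.3f}")
```

Rank-1 only checks the first retrieved item, while mAP rewards placing every correct match early, which is why the two numbers are usually reported together.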
📝 Abstract
Person re-identification (ReID) plays a critical role in intelligent surveillance systems by linking identities across multiple cameras in complex environments. However, ReID faces significant challenges, including appearance variations, domain shifts, and limited labeled data. This dissertation proposes approaches for three settings: supervised learning, unsupervised domain adaptation (UDA), and fully unsupervised learning. First, SCM-ReID integrates supervised contrastive learning with a hybrid loss (classification, center, triplet, and centroid-triplet losses), improving discriminative feature representation and achieving state-of-the-art accuracy on the Market-1501 and CUHK03 datasets. Second, for UDA, IQAGA and DAPRH combine GAN-based image augmentation, domain-invariant mapping, and pseudo-label refinement to mitigate domain discrepancies and enhance cross-domain generalization. Experiments demonstrate substantial gains over baseline methods, with mAP and Rank-1 improvements of up to 12% in challenging transfer scenarios. Finally, ViTC-UReID leverages Vision Transformer-based feature encoding and camera-aware proxy learning to boost unsupervised ReID. By integrating global and local attention with camera identity constraints, this method significantly outperforms existing unsupervised approaches on large-scale benchmarks. Comprehensive evaluations across CUHK03, Market-1501, DukeMTMC-reID, and MSMT17 confirm the effectiveness of the proposed methods. The contributions advance ReID research by addressing key limitations in feature learning, domain adaptation, and label noise handling, paving the way for robust deployment in real-world surveillance systems.
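To make the hybrid loss named for SCM-ReID more concrete, the following is a minimal pure-Python sketch of how the four listed terms (classification, center, triplet, centroid-triplet) might be summed. It is an illustration under stated assumptions, not the dissertation's implementation: the equal weights, the margin of 0.3, batch-hard triplet mining, and the use of per-batch class means as centers and centroids are all assumptions, and every function name is hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cross_entropy(logits, label):
    """Classification term: softmax cross-entropy, numerically stabilised."""
    m = max(logits)
    return m + math.log(sum(math.exp(z - m) for z in logits)) - logits[label]


def class_centroids(feats, labels):
    """Mean feature per identity within the batch (assumption: no learned centers)."""
    groups = {}
    for f, y in zip(feats, labels):
        groups.setdefault(y, []).append(f)
    return {y: [sum(col) / len(fs) for col in zip(*fs)] for y, fs in groups.items()}


def center_loss(feats, labels, cents):
    """Pulls each feature towards the centroid of its own identity."""
    return sum(0.5 * dist(f, cents[y]) ** 2 for f, y in zip(feats, labels)) / len(feats)


def triplet_loss(feats, labels, margin):
    """Batch-hard triplet loss: hardest positive versus hardest negative per anchor."""
    total = 0.0
    for i, (f, y) in enumerate(zip(feats, labels)):
        pos = [dist(f, g) for j, (g, z) in enumerate(zip(feats, labels)) if z == y and j != i]
        neg = [dist(f, g) for g, z in zip(feats, labels) if z != y]
        if pos and neg:
            total += max(0.0, max(pos) - min(neg) + margin)
    return total / len(feats)


def centroid_triplet_loss(feats, labels, cents, margin):
    """Triplet loss against centroids: own centroid versus nearest other centroid."""
    total = 0.0
    for f, y in zip(feats, labels):
        others = [dist(f, c) for z, c in cents.items() if z != y]
        if others:
            total += max(0.0, dist(f, cents[y]) - min(others) + margin)
    return total / len(feats)


def hybrid_loss(feats, logits, labels, margin=0.3, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four terms; weights and margin are assumed values."""
    cents = class_centroids(feats, labels)
    ce = sum(cross_entropy(l, y) for l, y in zip(logits, labels)) / len(labels)
    terms = (
        ce,
        center_loss(feats, labels, cents),
        triplet_loss(feats, labels, margin),
        centroid_triplet_loss(feats, labels, cents, margin),
    )
    return sum(w * t for w, t in zip(weights, terms)), terms


# Toy batch: two identities, two well-separated samples each (hypothetical values).
feats = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
labels = [0, 0, 1, 1]
logits = [[5.0, 0.0], [5.0, 0.0], [0.0, 5.0], [0.0, 5.0]]
total, terms = hybrid_loss(feats, logits, labels)
print(total, terms)
```

In this toy batch the two identities are far apart, so both triplet-style terms vanish and the total is dominated by the center term. In practice the relative weights would be tuned, and the centers are often learned parameters rather than batch means.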