Try Harder: Hard Sample Generation and Learning for Clothes-Changing Person Re-ID

πŸ“… 2025-07-15
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
In clothing-changing person re-identification (CC-ReID), the absence of a well-defined and controllable hard sample generation mechanism hinders model robustness and discriminative capability. To address this, we propose the first text–vision multimodal collaborative framework for hard sample generation and learning. Our method employs multimodal prompting to guide a generative network and introduces a dual-granularity hard sample generation (DGHSG) mechanism that explicitly models cross-clothing similarity. Furthermore, we design hardness-aware self-adaptive learning (HSAL) and semantic-label-driven distance adjustment to optimize embedding space distribution. Evaluated on PRCC and LTCC benchmarks, our approach achieves state-of-the-art performance, significantly accelerates convergence, and simultaneously improves generalization and identification accuracy. This work establishes a novel, interpretable, and controllable paradigm for hard sample learning in CC-ReID.

πŸ“ Abstract
Hard samples pose a significant challenge in person re-identification (ReID) tasks, particularly in clothing-changing person Re-ID (CC-ReID). Their inherent ambiguity or similarity, coupled with the lack of explicit definitions, makes them a fundamental bottleneck. These issues not only limit the design of targeted learning strategies but also diminish the model's robustness under clothing or viewpoint changes. In this paper, we propose a novel multimodal-guided Hard Sample Generation and Learning (HSGL) framework, which is the first effort to unify textual and visual modalities to explicitly define, generate, and optimize hard samples within a unified paradigm. HSGL comprises two core components: (1) Dual-Granularity Hard Sample Generation (DGHSG), which leverages multimodal cues to synthesize semantically consistent samples, including both coarse- and fine-grained hard positives and negatives for effectively increasing the hardness and diversity of the training data. (2) Hard Sample Adaptive Learning (HSAL), which introduces a hardness-aware optimization strategy that adjusts feature distances based on textual semantic labels, encouraging the separation of hard positives and drawing hard negatives closer in the embedding space to enhance the model's discriminative capability and robustness to hard samples. Extensive experiments on multiple CC-ReID benchmarks demonstrate the effectiveness of our approach and highlight the potential of multimodal-guided hard sample generation and learning for robust CC-ReID. Notably, HSAL significantly accelerates the convergence of the targeted learning procedure and achieves state-of-the-art performance on both PRCC and LTCC datasets. The code is available at https://github.com/undooo/TryHarder-ACMMM25.
Problem

Research questions and friction points this paper is trying to address.

Defining and generating hard samples for clothing-changing person Re-ID
Improving model robustness under clothing or viewpoint changes
Unifying textual and visual modalities for hard sample optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal-guided hard sample generation framework
Dual-granularity hard sample synthesis method
Hardness-aware adaptive learning optimization strategy
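The hardness-aware optimization idea behind HSAL can be sketched with a generic weighted triplet objective, where triplets that violate the margin more (a far-away hard positive, a nearby hard negative) contribute more to the loss. Everything below — the function names, the sigmoid weighting, and the margin value — is an illustrative assumption for a standard metric-learning formulation, not the paper's actual implementation:

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hardness_weighted_triplet(anchor, positive, negative, margin=0.3):
    """Illustrative hardness-aware triplet loss (hypothetical, not HSAL itself):
    the more a triplet violates the margin, the larger its weight, so
    optimization concentrates on the hardest samples."""
    d_pos = euclidean(anchor, positive)  # same identity, e.g. after a clothing change
    d_neg = euclidean(anchor, negative)  # different identity in similar clothing
    violation = d_pos - d_neg + margin
    hardness = 1.0 / (1.0 + math.exp(-violation))  # sigmoid weight in (0, 1)
    return hardness * max(violation, 0.0)
```

Easy triplets contribute zero loss, while hard ones are both penalized and up-weighted — one common way such schemes accelerate convergence on hard samples.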
πŸ”Ž Similar Papers
No similar papers found.
Hankun Liu
School of Computer Science and Engineering, Beihang University
Yujian Zhao
School of Artificial Intelligence, Beihang University
Guanglin Niu
Assistant Professor, Beihang University
artificial intelligence · natural language processing · knowledge graph · deep learning · knowledge reasoning