🤖 AI Summary
To address the degradation of person re-identification (ReID) performance when appearance changes across times and locations (for example, changes in clothing and hairstyle), this paper proposes the first purely skeleton-driven, clothing-change-robust ReID method. Departing from conventional appearance-based approaches, the method models identity exclusively through temporal human skeleton sequences. A spatio-temporal graph convolutional encoder learns discriminative dynamic skeletal representations, and a multi-segment prediction fusion mechanism improves temporal robustness. By supporting diverse pose estimation algorithms and aggregating predictions over video segments at inference, the approach achieves state-of-the-art performance on the CCVID dataset, significantly outperforming appearance-based methods. This demonstrates that human motion dynamics remain a generalizable and invariant cue for identity discrimination, even under substantial appearance changes.
📝 Abstract
Clothes-Changing Person Re-Identification (ReID) aims to recognize the same individual across videos captured at different times and locations. This task is particularly challenging due to changes in appearance, such as clothing, hairstyle, and accessories. We propose a Clothes-Changing ReID method that relies solely on skeleton data, with no appearance features. Traditional ReID methods often depend on appearance features, which causes accuracy to drop when clothing changes. Our approach uses a spatio-temporal Graph Convolutional Network (GCN) encoder to generate a skeleton-based descriptor for each individual. At test time, we improve accuracy by aggregating predictions from multiple segments of a video clip. Evaluated on the CCVID dataset with several different pose estimation models, our method achieves state-of-the-art performance, offering a robust and efficient solution for Clothes-Changing ReID.
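The test-time aggregation described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `encode_segment` is a hypothetical stand-in (a fixed random projection) for the spatio-temporal GCN encoder, and the segment length, stride, and descriptor size are assumed values chosen for the example.

```python
import numpy as np

def encode_segment(segment):
    # Hypothetical stand-in for the paper's spatio-temporal GCN encoder:
    # a fixed random projection, purely for illustration.
    rng = np.random.default_rng(0)              # fixed weights, reproducible
    W = rng.standard_normal((segment.size, 16))
    return segment.reshape(-1) @ W

def clip_descriptor(skeletons, seg_len=8, stride=4):
    """Fuse predictions from multiple segments of one skeleton clip.

    skeletons: (T, J, D) array of T frames, J joints, D coordinates.
    Returns one L2-normalized descriptor for the whole clip.
    """
    T = skeletons.shape[0]
    assert T >= seg_len, "clip shorter than one segment"
    embeddings = []
    for start in range(0, T - seg_len + 1, stride):
        e = encode_segment(skeletons[start:start + seg_len])
        embeddings.append(e / np.linalg.norm(e))  # normalize each segment
    fused = np.mean(embeddings, axis=0)           # average over segments
    return fused / np.linalg.norm(fused)

# Matching: rank gallery clips by cosine similarity of their descriptors.
query = clip_descriptor(np.random.default_rng(1).standard_normal((24, 17, 2)))
gallery = clip_descriptor(np.random.default_rng(2).standard_normal((32, 17, 2)))
score = float(query @ gallery)                    # cosine similarity in [-1, 1]
```

Averaging normalized segment embeddings is one common fusion choice; the same structure applies if segment-level identity logits, rather than embeddings, are averaged.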