Dual-Prompt CLIP with Hybrid Visual Encoders for Occluded Person Re-Identification

📅 2026-05-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses person re-identification under occlusion, where missing body regions and the neglect of occluder semantics hinder cross-view matching. The authors propose DPL-ReID, a framework that combines a dual-prompt learning strategy (modeling holistic pedestrian semantics while remaining robust to occlusion) with Real-World Occlusion Augmentation (RWOA) and a Weighted Gated Feature Fusion (WGFF) mechanism. Together, these components guide the CLIP visual encoder to produce more comprehensive and robust feature representations. The authors report state-of-the-art performance on several occluded ReID benchmarks.

📝 Abstract

Occluded person re-identification focuses on matching partially visible pedestrians across multiple camera views. However, occlusions disrupt body-region cues, thereby complicating cross-view matching. Most person ReID methods built on pretrained vision-language models focus only on enhancing prompt-based feature learning while ignoring the semantic information of occluders. Building on the success of CLIP-ReID, we propose a novel Dual Prompt Learning ReID (DPL-ReID) model for occluded person ReID. It incorporates a Dual Prompt Learning (Dual-PL) strategy, which uses textual cues to capture complete pedestrian semantics while remaining robust to occlusion, and a Real-World Occlusion Augmentation (RWOA) method that realistically simulates occlusion scenarios encountered in the real world to enrich occluded samples. In addition, we design a Weighted Gated Feature Fusion (WGFF) method, which incorporates LSNet to capture global information and act as a feature-gating mechanism. This mechanism effectively guides the CLIP visual encoder toward generating more comprehensive feature representations. Extensive experiments on several benchmark occluded ReID datasets show that our proposed DPL-ReID achieves state-of-the-art performance. The occlusion instance library is available at https://github.com/stone-qiao/DPL-ReID.
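
To make the idea of occlusion augmentation concrete, the following is a minimal sketch of pasting an occluder patch from a library onto a pedestrian image. It is not the authors' RWOA implementation: the function names `paste_occluder` and `random_occlusion`, the `prob` parameter, the uniform placement rule, and the use of nested lists as images are all assumptions made for illustration.

```python
import random


def paste_occluder(image, occluder, top, left):
    """Return a copy of `image` with `occluder` pasted at (top, left).

    Both arguments are 2D lists of pixel values (rows of columns).
    Occluder pixels that fall outside the image are clipped.
    """
    out = [row[:] for row in image]
    height, width = len(image), len(image[0])
    for i, occ_row in enumerate(occluder):
        for j, pixel in enumerate(occ_row):
            r, c = top + i, left + j
            if 0 <= r < height and 0 <= c < width:
                out[r][c] = pixel
    return out


def random_occlusion(image, occluder_library, prob=0.5, rng=None):
    """With probability `prob`, paste a random occluder at a random position.

    Hypothetical placement rule: the top-left corner is drawn uniformly
    over the image, so the occluder may be partially clipped.
    """
    if rng is None:
        rng = random.Random()
    if not occluder_library or rng.random() >= prob:
        return [row[:] for row in image]
    occluder = rng.choice(occluder_library)
    top = rng.randrange(len(image))
    left = rng.randrange(len(image[0]))
    return paste_occluder(image, occluder, top, left)
```

A training pipeline would typically call something like `random_occlusion` on each sample before the usual resize and normalization steps. How the paper selects, scales, and positions occluders is not stated in the abstract.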

Problem

Research questions and friction points this paper is trying to address.

occluded person re-identification

occlusion

cross-view matching

pedestrian matching

visual cues

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual Prompt Learning

Real-World Occlusion Augmentation

Weighted Gated Feature Fusion
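
As a rough illustration of what a gated feature fusion can look like, the sketch below blends a main feature vector with an auxiliary one through an element-wise sigmoid gate. This is a generic gating pattern, not the paper's WGFF formulation: the names `gated_fusion`, `gate_weights`, and `gate_bias`, and the convex-combination rule, are assumptions for illustration only.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def gated_fusion(main_feat, aux_feat, gate_weights, gate_bias):
    """Fuse two equal-length feature vectors with an element-wise gate.

    Assumed rule (illustrative, not taken from the paper):
        gate_i  = sigmoid(gate_weights[i] * aux_feat[i] + gate_bias[i])
        fused_i = gate_i * main_feat[i] + (1 - gate_i) * aux_feat[i]
    """
    n = len(main_feat)
    if not (len(aux_feat) == len(gate_weights) == len(gate_bias) == n):
        raise ValueError("all inputs must have the same length")
    fused = []
    for m, a, w, b in zip(main_feat, aux_feat, gate_weights, gate_bias):
        gate = sigmoid(w * a + b)
        fused.append(gate * m + (1.0 - gate) * a)
    return fused
```

In this sketch `main_feat` stands in for a CLIP visual feature and `aux_feat` for a feature from the auxiliary network. In practice the gate parameters would be learned, and the actual fusion in the paper may differ in both form and placement.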