PSDiff: Diffusion Model for Person Search with Iterative and Collaborative Refinement

📅 2023-09-20
🏛️ IEEE Transactions on Circuits and Systems for Video Technology
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing person search methods suffer from two key limitations: (1) the pedestrian candidates produced by the detector are poorly matched to what ReID requires, and (2) the detection and ReID sub-tasks are tightly coupled yet collaborate too little, which leads to suboptimal feature learning. To address these, we propose the first diffusion-based, dual-path collaborative denoising framework for person search. It abandons conventional detection priors and instead jointly denoises noisy boxes and ReID embeddings toward their ground truths. Our core innovation is the Collaborative Denoising Layer (CDL), which enables end-to-end mutual refinement of detection and ReID features throughout the iterative denoising. The framework supports fully differentiable training and achieves state-of-the-art performance on standard benchmarks, including CUHK-SYSU and PRW, while using fewer parameters and offering controllable, elastic computational cost.
📝 Abstract
Dominant person search methods aim to localize and recognize query persons in a unified network that jointly optimizes two sub-tasks, i.e., pedestrian detection and Re-IDentification (ReID). Despite significant progress, current methods face two primary challenges: 1) the pedestrian candidates learned within detectors are suboptimal for the ReID task; and 2) the potential for collaboration between the two sub-tasks is overlooked. To address these issues, we present PSDiff, a novel person search framework based on a diffusion model. PSDiff formulates person search as a dual denoising process from noisy boxes and ReID embeddings to ground truths. Distinct from the conventional Detection-to-ReID approach, our denoising paradigm discards the prior pedestrian candidates generated by detectors, thereby avoiding the local optimum problem of the ReID task. Following the new paradigm, we further design a new Collaborative Denoising Layer (CDL) to optimize the detection and ReID sub-tasks in an iterative and collaborative way, which makes the two sub-tasks mutually beneficial. Extensive experiments on the standard benchmarks show that PSDiff achieves state-of-the-art performance with fewer parameters and elastic computing overhead.
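To make the "dual denoising" idea more concrete, here is a minimal, hedged sketch of how noisy boxes and noisy ReID embeddings might be refined side by side. It is not PSDiff's actual implementation: the cosine noise schedule, the DDIM-style update, and every name (`alpha_bar`, `add_noise`, `ddim_step`, `dual_denoise`, the `denoiser` callable) are assumptions introduced for illustration. The `denoiser` argument stands in for a network such as the CDL, whose internals the abstract does not specify.

```python
import math
import random


def alpha_bar(t, T):
    """Cumulative signal rate at step t (assumed cosine schedule)."""
    s = 0.008

    def f(u):
        return math.cos((u / T + s) / (1 + s) * math.pi / 2) ** 2

    return f(t) / f(0)


def add_noise(x0, t, T, rng):
    """Forward process: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps."""
    a = alpha_bar(t, T)
    return [math.sqrt(a) * v + math.sqrt(1 - a) * rng.gauss(0, 1) for v in x0]


def ddim_step(x_t, x0_pred, t, t_prev, T):
    """Deterministic DDIM-style move from step t to t_prev (requires t > 0)."""
    a_t, a_prev = alpha_bar(t, T), alpha_bar(t_prev, T)
    # Noise implied by the current sample and the predicted clean signal.
    eps = [(x - math.sqrt(a_t) * p) / math.sqrt(1 - a_t)
           for x, p in zip(x_t, x0_pred)]
    return [math.sqrt(a_prev) * p + math.sqrt(1 - a_prev) * e
            for p, e in zip(x0_pred, eps)]


def dual_denoise(box_t, emb_t, denoiser, T, steps):
    """Refine a box and a ReID embedding together; assumes 1 <= steps <= T."""
    ts = [round(T * (steps - i) / steps) for i in range(steps + 1)]  # T ... 0
    for t, t_prev in zip(ts[:-1], ts[1:]):
        # The (hypothetical) denoiser sees BOTH paths and predicts both
        # clean targets, so each sub-task can inform the other.
        box0, emb0 = denoiser(box_t, emb_t, t)
        box_t = ddim_step(box_t, box0, t, t_prev, T)
        emb_t = ddim_step(emb_t, emb0, t, t_prev, T)
    return box_t, emb_t


if __name__ == "__main__":
    rng = random.Random(0)
    gt_box = [0.5, 0.5, 0.2, 0.4]   # illustrative (cx, cy, w, h)
    gt_emb = [0.1, -0.3, 0.7]       # illustrative 3-d embedding
    noisy_box = add_noise(gt_box, 1000, 1000, rng)
    noisy_emb = add_noise(gt_emb, 1000, 1000, rng)

    def oracle(box, emb, t):
        # Stand-in for a trained network: always predicts the ground truth.
        return gt_box, gt_emb

    print(dual_denoise(noisy_box, noisy_emb, oracle, T=1000, steps=4))
```

In this sketch the `steps` argument trades accuracy for compute. That is one plausible reading of the "elastic computing overhead" mentioned above, though the abstract does not say how the elasticity is obtained.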
Problem

Research questions and friction points this paper is trying to address.

Pedestrian Detection
ReID (Re-Identification)
Synergy Optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

PSDiff
diffusion model
CDL layer