🤖 AI Summary
Text-to-image person search commonly relies on web-crawled image-text pairs for dataset construction, yet these pairs often suffer from severe semantic misalignment noise that significantly degrades retrieval performance. To address this, we propose the Dynamic Uncertainty and Relational Alignment (DURA) framework. DURA is the first to model cross-modal similarity evidence as a Dirichlet distribution, explicitly capturing matching uncertainty. It incorporates a Key Feature Selector (KFS) for fine-grained feature selection and introduces a Dynamic Softmax Hinge Loss (DSH-Loss) that jointly optimizes dynamic hard-negative weighting and bidirectional cross-modal alignment. Evaluated on three benchmark datasets, DURA achieves state-of-the-art retrieval accuracy under both low- and high-noise conditions, demonstrating substantial gains in robustness to noisy supervision.
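The Dirichlet-based uncertainty modeling can be sketched as follows. This is a minimal illustration in the style of evidential deep learning, assuming non-negative per-class evidence (e.g. from a ReLU head) over the two matching classes (match / non-match); the helper name `dirichlet_uncertainty` and this exact parameterization are assumptions for illustration, not the paper's formulation.

```python
import numpy as np

def dirichlet_uncertainty(evidence):
    """Turn non-negative evidence into Dirichlet belief masses and vacuity.

    evidence: per-class evidence values (here K=2: match vs. non-match).
    alpha = evidence + 1 parameterises a Dirichlet distribution; the
    Dirichlet strength S = sum(alpha) controls how confident the model is.
    """
    alpha = np.asarray(evidence, dtype=float) + 1.0
    strength = alpha.sum()
    belief = (alpha - 1.0) / strength        # per-class belief mass
    uncertainty = len(alpha) / strength      # vacuity: high when evidence is scarce
    return belief, uncertainty
```

With strong match evidence, e.g. `dirichlet_uncertainty([8.0, 0.0])`, the match belief is high and the uncertainty low; a pair with little evidence in either class yields high uncertainty, which is what lets noisy (mismatched) pairs be down-weighted.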
📝 Abstract
Text-to-image person search aims to identify an individual from a free-form text description. To reduce data-collection costs, large-scale text-image datasets are often built from co-occurring pairs found online. This introduces noise, particularly mismatched pairs, which degrades retrieval performance; existing methods that emphasize hard negative samples tend to amplify this noise. To address these issues, we propose the Dynamic Uncertainty and Relational Alignment (DURA) framework, which comprises a Key Feature Selector (KFS) and a new loss function, the Dynamic Softmax Hinge Loss (DSH-Loss). KFS captures and models noise uncertainty, improving retrieval reliability: the bidirectional evidence from cross-modal similarity is modeled as a Dirichlet distribution, enhancing adaptability to noisy data. DSH-Loss adjusts the weight placed on hard negative samples to improve robustness in noisy environments. Experiments on three datasets show that the method offers strong noise resistance and improves retrieval performance in both low- and high-noise scenarios.
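A softmax-weighted bidirectional hinge loss in this spirit can be sketched in a few lines of numpy. The function name `dynamic_softmax_hinge`, the temperature `tau`, and the margin value are illustrative assumptions; the paper's exact DSH-Loss may differ, but the sketch shows the core idea: hinge terms over negatives, with softmax weights that shift mass toward harder (more similar) negatives, summed over both retrieval directions.

```python
import numpy as np

def dynamic_softmax_hinge(sim, margin=0.2, tau=10.0):
    """Bidirectional hinge loss with softmax weighting of hard negatives.

    sim: (B, B) image-text similarity matrix; diagonal entries are the
    matched pairs. Both text->image (rows) and image->text (columns,
    via the transpose) directions are optimized.
    """
    B = sim.shape[0]
    pos = np.diag(sim)                          # similarity of matched pairs
    mask = ~np.eye(B, dtype=bool)               # off-diagonal = negatives
    total = 0.0
    for rows in (sim, sim.T):                   # both retrieval directions
        hinge = np.maximum(0.0, margin - pos[:, None] + rows)
        neg = np.where(mask, rows, -np.inf)     # exclude positives from weights
        w = np.exp(tau * neg)
        w /= w.sum(axis=1, keepdims=True)       # softmax: hard negatives dominate
        total += (w * np.where(mask, hinge, 0.0)).sum() / B
    return total
```

When all negatives are already separated by more than the margin, the loss is zero; as a negative's similarity approaches the positive's, both its hinge term and its softmax weight grow, which is the "dynamic" hard-negative emphasis.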