Dynamic Proxy Domain Generalizes the Crowd Localization by Better Binary Segmentation

📅 2024-04-22
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Problem: Existing confidence-threshold learners for cross-domain crowd localization generalize poorly when scene density, scale, and content vary drastically across domains. Method: This paper proposes the Dynamic Proxy Domain (DPD) framework, which adaptively constructs a proxy domain, motivated by a theoretical upper bound on the generalization error of a binary classifier, to bridge the gap to unseen target domains. DPD pairs a training paradigm with a proxy domain generator so that the threshold-based localizer learns pixel-level binary segmentation that transfers across domains. Contribution/Results: To our knowledge, this is the first work to integrate generalization error analysis with proxy domain generation to improve out-of-distribution robustness. Experiments on five domain-shift scenarios show improved binary segmentation accuracy and head-point localization on unseen target domains. The authors state that the code will be made available at https://github.com/zhangda1018/DPD.
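
For orientation only, bounds of this kind are often written in the form of the classical domain-adaptation bound for binary classifiers. The summary does not state the paper's exact bound, so the inequality below is an assumed illustration and not the authors' result. Here $h$ is a binary classifier from hypothesis class $\mathcal{H}$, $\epsilon_S$ and $\epsilon_T$ are its error rates on the source and target domains, $d_{\mathcal{H}\Delta\mathcal{H}}$ is a divergence between the two domain distributions, and $\lambda$ is the error of the best single classifier on both domains.

```latex
\epsilon_T(h) \;\le\; \epsilon_S(h)
  \;+\; \tfrac{1}{2}\, d_{\mathcal{H}\Delta\mathcal{H}}(\mathcal{D}_S, \mathcal{D}_T)
  \;+\; \lambda,
\qquad
\lambda = \min_{h' \in \mathcal{H}} \bigl[ \epsilon_S(h') + \epsilon_T(h') \bigr]
```

Read this way, a generated proxy domain can be seen as a stand-in for the unseen target when controlling the divergence term. That reading is an interpretation of the summary, not a statement taken from the paper.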

๐Ÿ“ Abstract
Crowd localization aims to predict the precise location of each instance in an image. Current advanced methods handle congested scenes with pixel-wise binary classification, in which pixel-level thresholds binarize the predicted confidence that a pixel belongs to a pedestrian head. Because crowd scenes vary widely in content, count, and scale, the confidence-threshold learner is fragile and generalizes poorly under domain shift. Moreover, the target domain is usually unknown during training. It is therefore imperative to explore how to improve the generalization of the confidence-threshold locator to a latent target domain. In this paper, we propose a Dynamic Proxy Domain (DPD) method to generalize the learner under domain shift. Concretely, based on a theoretical analysis of the upper bound on a binary classifier's generalization error risk on the latent target domain, we propose introducing a generated proxy domain to facilitate generalization. Then, based on the theory, we design a DPD algorithm composed of a training paradigm and a proxy domain generator to enhance the domain generalization of the confidence-threshold learner. We also evaluate our method on five kinds of domain shift scenarios, demonstrating its effectiveness in generalizing crowd localization. Our code will be available at https://github.com/zhangda1018/DPD.
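
The confidence-threshold idea in the abstract can be illustrated with a minimal sketch. It assumes that a model has already produced a per-pixel confidence map and a per-pixel threshold map (both are hard-coded toy values here), and that head points are read off as centroids of connected foreground blobs. The function names `binarize` and `head_points` and the centroid post-processing are hypothetical choices for illustration, not the authors' implementation.

```python
from collections import deque


def binarize(confidence, threshold):
    """Mark a pixel as head (1) where its confidence reaches its own threshold."""
    return [
        [1 if c >= t else 0 for c, t in zip(conf_row, thr_row)]
        for conf_row, thr_row in zip(confidence, threshold)
    ]


def head_points(mask):
    """Return one (row, col) centroid per 4-connected foreground blob."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    points = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 0 or seen[r][c]:
                continue
            # Breadth-first flood fill collects every pixel of this blob.
            queue = deque([(r, c)])
            seen[r][c] = True
            blob = []
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            cy = sum(y for y, _ in blob) / len(blob)
            cx = sum(x for _, x in blob) / len(blob)
            points.append((cy, cx))
    return points


# Toy inputs (assumed values): a 3x4 confidence map and a pixel-level threshold map.
confidence = [
    [0.9, 0.8, 0.1, 0.1],
    [0.7, 0.9, 0.1, 0.6],
    [0.1, 0.1, 0.1, 0.7],
]
threshold = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, 0.5, 0.65],  # a stricter threshold at one pixel
    [0.5, 0.5, 0.5, 0.5],
]

mask = binarize(confidence, threshold)
points = head_points(mask)
print(mask)
print(points)
```

Because the threshold is set per pixel, the pixel with confidence 0.6 is rejected by its local threshold of 0.65 even though it would pass a global threshold of 0.5. A threshold map that is tuned to one domain can misfire in the same way on another, which is the fragility the abstract describes.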
Problem

Research questions and friction points this paper is trying to address.

Enhance generalization of confidence-threshold locator for varying crowd scenes
Address domain knowledge shift in crowd localization predictions
Improve binary segmentation accuracy under diverse domain shift scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic Proxy Domain enhances generalization
Proxy domain generator improves binary segmentation
Training paradigm adapts to domain shifts
Authors

Junyu Gao
School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University, Xi’an 710072, China and Key Laboratory of Intelligent Interaction and Applications, Ministry of Industry and Information Technology, Xi’an 710072, P. R. China
Da Zhang
School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University, Xi’an 710072, China and Key Laboratory of Intelligent Interaction and Applications, Ministry of Industry and Information Technology, Xi’an 710072, P. R. China
Xuelong Li
Institute of Artificial Intelligence (TeleAI), China Telecom Corp Ltd, 31 Jinrong Street, Beijing 100033, P. R. China