SynMVCrowd: A Large Synthetic Benchmark for Multi-view Crowd Counting and Localization

📅 2026-03-25

🤖 AI Summary

Existing multi-view crowd counting and localization methods are predominantly evaluated on small scenes with few camera views and small crowds, so the results do not reflect performance in complex real-world environments. To address this gap, this work introduces SynMVCrowd, a large synthetic benchmark for multi-view crowd analysis, comprising 50 synthetic scenes with multi-view imagery and crowds of up to 1,000 individuals. Building on this benchmark, the work proposes strong baselines for multi-view crowd localization and counting. Experiments show that these baselines outperform the comparison methods on SynMVCrowd, and that the benchmark improves domain transfer to novel real-world scenes for both multi-view and single-image counting.

📝 Abstract

Existing multi-view crowd counting and localization methods are evaluated on relatively small scenes with limited crowd sizes, camera views, and frames. This makes evaluating and comparing these methods impractical, because such small datasets are easily overfit. To mitigate this issue, 3DROM proposes a data augmentation method. In this paper, we instead propose a large synthetic benchmark, SynMVCrowd, for more practical evaluation and comparison of multi-view crowd counting and localization methods. The SynMVCrowd benchmark consists of 50 synthetic scenes with a large number of multi-view frames and camera views and much larger crowds (up to 1000 people), making it better suited to large-scene multi-view crowd vision tasks. We also propose strong multi-view crowd localization and counting baselines that outperform all comparison methods on the new SynMVCrowd benchmark. Moreover, we show that the benchmark improves domain transfer for both multi-view and single-image counting on novel real scenes. As a result, the proposed benchmark could move research on multi-view and single-image crowd counting and localization toward more practical applications. The code and datasets are available at https://github.com/zqyq/SynMVCrowd.

Problem

Research questions and friction points this paper is trying to address.

multi-view crowd counting

crowd localization

synthetic benchmark

large-scale scene

dataset limitation

Innovation

Methods, ideas, or system contributions that make the work stand out.

synthetic benchmark

multi-view crowd counting

crowd localization

domain transfer

large-scale dataset
