SynMVCrowd: A Large Synthetic Benchmark for Multi-view Crowd Counting and Localization

📅 2026-03-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing multi-view crowd counting and localization methods are predominantly evaluated on small, sparsely populated scenes with few camera views, so reported results do not reflect performance in complex real-world environments. To address this gap, this work introduces SynMVCrowd, the first synthetic benchmark tailored for large-scale multi-view crowd analysis, comprising 50 high-fidelity 3D scenes with multi-view imagery and dense crowds of up to 1,000 individuals. The work also presents strong baseline models that fuse multi-view information for counting and localization. Experiments show that these baselines outperform existing methods on SynMVCrowd, and that using the benchmark improves domain transfer to novel real-world scenes for both multi-view and single-image counting.

📝 Abstract
Existing multi-view crowd counting and localization methods are evaluated on relatively small scenes with limited crowd sizes, camera views, and frames. This makes the evaluation and comparison of these methods unreliable, since they easily overfit such small datasets. 3DROM addresses the problem with a data augmentation method. In this paper, we instead propose a large synthetic benchmark, SynMVCrowd, for more practical evaluation and comparison of multi-view crowd counting and localization methods. The SynMVCrowd benchmark consists of 50 synthetic scenes with a large number of multi-view frames and camera views and much larger crowds (up to 1,000 people), which makes it better suited to large-scene multi-view crowd vision tasks. In addition, we propose strong multi-view crowd localization and counting baselines that outperform all comparison methods on the new SynMVCrowd benchmark. Moreover, we show that the benchmark helps achieve better domain-transfer performance for multi-view and single-image counting on novel real scenes. As a result, the proposed benchmark could move research on multi-view and single-image crowd counting and localization toward more practical applications. The code and datasets are available at https://github.com/zqyq/SynMVCrowd.
Problem

Research questions and friction points this paper is trying to address.

multi-view crowd counting
crowd localization
synthetic benchmark
large-scale scene
dataset limitation
Innovation

Methods, ideas, or system contributions that make the work stand out.

synthetic benchmark
multi-view crowd counting
crowd localization
domain transfer
large-scale dataset
Qi Zhang
Visual Computing Research Center, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
Daijie Chen
Visual Computing Research Center, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
Yunfei Gong
Visual Computing Research Center, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
Hui Huang
Chair Professor and CS Dean, Shenzhen University
Graphics, Geometry, Points, Shapes, Images