🤖 AI Summary
In crowd counting, ground-truth density maps are highly sparse (often more than 95% zeros), so conventional MSE-based density estimation performs poorly in sparse regions; moreover, the Gaussian assumption implicit in MSE conflicts with the discrete, non-negative nature of count data. This paper applies zero-inflated Poisson (ZIP) regression to crowd counting for the first time, establishing a probabilistic density estimation framework. ZIP explicitly models both zero inflation and count discreteness, and the model is trained by minimizing the negative log-likelihood. The method builds on the Enhanced Block Classification (EBC) framework and is compatible with diverse backbone networks. It consistently outperforms EBC and state-of-the-art approaches on four major benchmarks (ShanghaiTech Part_A, ShanghaiTech Part_B, UCF-QNRF, and JHU-CROWD++), with notable accuracy gains in sparse regions. The approach offers strong theoretical grounding, good generalization, and principled handling of the sparsity and discreteness inherent in crowd counting.
📝 Abstract
Density map estimation has become the mainstream paradigm in crowd counting. However, most existing methods overlook the extreme sparsity of ground-truth density maps. In real-world crowd scenes, the vast majority of spatial regions (often over 95%) contain no people, leading to heavily imbalanced count distributions. Ignoring this imbalance can bias models toward overestimating counts in dense regions and performing poorly in sparse ones. Furthermore, most loss functions used in density estimation are based on MSE and implicitly assume Gaussian distributions, which are ill-suited to modeling discrete, non-negative count data. In this paper, we propose EBC-ZIP, a crowd counting framework that models the spatial distribution of counts using a Zero-Inflated Poisson (ZIP) regression formulation. Our approach replaces the traditional regression loss with the negative log-likelihood of the ZIP distribution, enabling better handling of zero-heavy distributions while preserving count accuracy. Built upon the recently proposed Enhanced Block Classification (EBC) framework, EBC-ZIP inherits EBC's advantages in preserving the discreteness of targets and ensuring training stability, while further improving performance through a more principled probabilistic loss. We also evaluate EBC-ZIP with backbones of varying computational complexity to assess its scalability. Extensive experiments on four crowd counting benchmarks demonstrate that EBC-ZIP consistently outperforms EBC and achieves state-of-the-art results.
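To make the loss described above concrete, here is a minimal NumPy sketch of the negative log-likelihood of a standard zero-inflated Poisson distribution. It is an illustration only and not the authors' implementation: the function name `zip_nll`, the arguments `pi` (probability of a structural zero) and `lam` (Poisson rate), and the clamping constant `eps` are assumptions, and the abstract does not say how EBC-ZIP predicts these per-block parameters.

```python
import math
import numpy as np


def zip_nll(counts, pi, lam, eps=1e-8):
    """Mean negative log-likelihood of counts under a zero-inflated Poisson.

    counts: non-negative integer count per block
    pi:     probability of a structural zero per block, in [0, 1]
    lam:    Poisson rate per block, > 0
    All three inputs must have the same (or broadcastable) shapes.
    """
    counts = np.asarray(counts, dtype=np.float64)
    # Clamp parameters away from 0 and 1 so the logarithms stay finite.
    pi = np.clip(np.asarray(pi, dtype=np.float64), eps, 1.0 - eps)
    lam = np.maximum(np.asarray(lam, dtype=np.float64), eps)

    # Zero counts: log P(Y = 0) = log(pi + (1 - pi) * exp(-lam))
    log_p_zero = np.log(pi + (1.0 - pi) * np.exp(-lam))

    # Positive counts: log P(Y = k) = log(1 - pi) - lam + k * log(lam) - log(k!)
    log_factorial = np.vectorize(math.lgamma, otypes=[float])(counts + 1.0)
    log_p_pos = np.log1p(-pi) - lam + counts * np.log(lam) - log_factorial

    log_p = np.where(counts == 0, log_p_zero, log_p_pos)
    return float(-log_p.mean())


# Hypothetical example: three empty blocks and one block containing 3 people.
counts = np.array([0, 0, 0, 3])
pi = np.array([0.9, 0.9, 0.9, 0.1])
lam = np.array([0.5, 0.5, 0.5, 3.0])
loss = zip_nll(counts, pi, lam)
```

The zero branch is what separates this from a plain Poisson loss: an empty block can be explained either by a structural zero (weight `pi`) or by a Poisson draw of zero, so a high `pi` on empty blocks lowers the loss without forcing `lam` toward zero.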