🤖 AI Summary
Real-world networks commonly exhibit structural sparsity, heavy-tailed edge count distributions, and excess zeros (disconnected node pairs). Classical multi-edge random graph models (e.g., $G(N,p)$, configuration models, stochastic block models) capture these features poorly. To address this, we systematically integrate zero-inflation into these canonical models, adding a mechanism that accounts for the excess zeros observed in empirical data. We show that sparsity persists even when a large number of interactions is observed: most node pairs remain disconnected. Through theoretical formulation, parameter estimation, and statistical testing, we evaluate the zero-inflated extensions on all datasets from the Sociopatterns repository. The zero-inflated models reflect both the sparsity and the heavy-tailed edge count distributions of the empirical data more accurately than their classical counterparts, and ignoring these properties leads to biased models that misrepresent complex systems and their dynamics.
📝 Abstract
Real-world networks are sparse. As we show in this article, even when a large number of interactions is observed, most node pairs remain disconnected. We demonstrate that classical multi-edge network models, such as the $G(N,p)$, configuration models, and stochastic block models, fail to accurately capture this phenomenon. To mitigate this issue, zero-inflation must be integrated into these traditional models. Through zero-inflation, we incorporate a mechanism that accounts for the excess number of zeroes (disconnected pairs) observed in empirical data. By performing an analysis on all the datasets from the Sociopatterns repository, we illustrate how zero-inflated models more accurately reflect the sparsity and heavy-tailed edge count distributions observed in empirical data. Our findings underscore that failing to account for these ubiquitous properties in real-world networks inadvertently leads to biased models that do not accurately represent complex systems and their dynamics.
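To make the zero-inflation mechanism concrete, the following is a minimal sketch, not the authors' formulation. It assumes that the edge count of each node pair follows a Poisson distribution (a common stand-in for a $G(N,p)$-style multi-edge model) and that, with probability `pi`, a pair is structurally disconnected regardless of that distribution. The function names, the expectation-maximization fit, and the toy edge counts are all hypothetical and chosen purely for illustration.

```python
import math

def poisson_pmf(k, lam):
    """P(K = k) for a Poisson edge count with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson: with probability pi a pair is structurally
    disconnected; otherwise its edge count is Poisson(lam)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

def fit_zip(counts, iters=200):
    """Maximum-likelihood fit of (lam, pi) by expectation-maximization."""
    n = len(counts)
    n0 = sum(1 for c in counts if c == 0)
    total = sum(counts)
    lam, pi = max(total / n, 1e-9), 0.5
    for _ in range(iters):
        # E-step: probability that an observed zero is a structural zero.
        z = pi / (pi + (1.0 - pi) * math.exp(-lam))
        # M-step: update the mixing weight and the Poisson rate.
        pi = n0 * z / n
        lam = total / (n - n0 * z)
    return lam, pi

def log_likelihood(counts, pmf):
    """Sum of log-probabilities of the observed edge counts under pmf."""
    return sum(math.log(pmf(c)) for c in counts)

# Hypothetical toy data: edge counts for 100 node pairs, 90 of them disconnected.
counts = [0] * 90 + [1] * 2 + [2] * 3 + [3] * 3 + [5] * 2

lam_pois = sum(counts) / len(counts)  # the Poisson MLE is the sample mean
lam_zip, pi_zip = fit_zip(counts)

ll_pois = log_likelihood(counts, lambda k: poisson_pmf(k, lam_pois))
ll_zip = log_likelihood(counts, lambda k: zip_pmf(k, lam_zip, pi_zip))

print(f"observed zero fraction : {counts.count(0) / len(counts):.3f}")
print(f"Poisson P(0)           : {poisson_pmf(0, lam_pois):.3f}")
print(f"ZIP P(0)               : {zip_pmf(0, lam_zip, pi_zip):.3f}")
print(f"log-likelihood Poisson : {ll_pois:.2f}")
print(f"log-likelihood ZIP     : {ll_zip:.2f}")
```

On this toy data the plain Poisson model must explain both the many zeros and the occasional large counts with a single rate, so it underestimates the fraction of disconnected pairs. The zero-inflated variant separates the two effects: `pi` absorbs the excess zeros, which leaves `lam` free to describe the counts of pairs that can interact. Because the Poisson model is the special case `pi = 0`, the fitted zero-inflated model can never have a lower likelihood; a formal comparison would still need to account for the extra parameter.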