🤖 AI Summary
Problem: Core-periphery structure still lacks a statistically formalized characterization. Method: This paper introduces a model-agnostic statistical inference framework: (i) it defines an interpretable core strength parameter that quantifies the intensity of core-periphery structure at the level of the generative mechanism; (ii) it proposes an intersection-based hypothesis testing procedure under multiple null models to separate endogenous (intrinsic) core-periphery structure from exogenous (induced) structure; and (iii) it establishes theoretical guarantees for label recovery, including exact recovery, and for the Type I error and power of the tests. The approach combines random graph theory, a population-level parameter, and frequentist hypothesis testing. Results: Experiments on synthetic networks show strong performance. Applications to real-world networks indicate that statistically significant core-periphery structure is somewhat rare, suggesting that some previously reported core-periphery structure may not be statistically significant.
📝 Abstract
Core-periphery (CP) structure is an important meso-scale network property in which nodes group into a small, densely interconnected core and a sparse periphery whose members primarily connect to the core rather than to each other. While this structure has been observed in numerous real-world networks, it has received little statistical formalization. In this work, we develop a statistical framework for CP structures by introducing a model-agnostic and generalizable population parameter which quantifies the strength of a CP structure at the level of the data-generating mechanism. We study this parameter under four canonical random graph models and establish theoretical guarantees for label recovery, including exact label recovery. Next, we construct intersection tests for validating the presence and strength of a CP structure under multiple null models, and prove theoretical guarantees for Type I error and power. These tests provide a formal distinction between exogenous (or induced) and endogenous (or intrinsic) CP structure in heterogeneous networks, enabling a level of structural resolution that goes beyond merely detecting the presence of CP structure. The proposed methods show excellent performance on synthetic data, and our applications demonstrate that statistically significant CP structure is somewhat rare in real-world networks.
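
The verbal definition of CP structure in the abstract can be made concrete with a small simulation. The sketch below is illustrative only and is not the paper's method: it assumes a two-block random graph in which the edge probabilities satisfy `p_cc > p_cp > p_pp` (core-core, core-periphery, periphery-periphery), which is one common way to encode "dense core, sparse periphery attached to the core". The function names `sample_cp_graph` and `block_densities`, and all parameter values, are hypothetical; the empirical block densities computed here are not the paper's core strength parameter, which the abstract does not define.

```python
import random
from itertools import combinations


def sample_cp_graph(n_core, n_periphery, p_cc, p_cp, p_pp, seed=0):
    """Sample an undirected two-block graph (assumed model, for illustration).

    Nodes 0..n_core-1 are labelled core (1); the rest are periphery (0).
    Each pair of nodes is connected independently with a probability that
    depends only on the labels of its two endpoints.
    """
    rng = random.Random(seed)
    labels = [1] * n_core + [0] * n_periphery
    probs = (p_pp, p_cp, p_cc)  # indexed by the number of core endpoints
    edges = set()
    for i, j in combinations(range(len(labels)), 2):
        if rng.random() < probs[labels[i] + labels[j]]:
            edges.add((i, j))
    return labels, edges


def block_densities(labels, edges):
    """Return the observed edge density within and between the two groups."""
    n_core = sum(labels)
    n_per = len(labels) - n_core
    possible = {
        2: n_core * (n_core - 1) // 2,  # core-core pairs
        1: n_core * n_per,              # core-periphery pairs
        0: n_per * (n_per - 1) // 2,    # periphery-periphery pairs
    }
    counts = {0: 0, 1: 0, 2: 0}
    for i, j in edges:
        counts[labels[i] + labels[j]] += 1
    names = (("cc", 2), ("cp", 1), ("pp", 0))
    return {name: (counts[k] / possible[k] if possible[k] else 0.0)
            for name, k in names}


# Example: 20 core nodes, 80 periphery nodes, hypothetical probabilities.
labels, edges = sample_cp_graph(20, 80, p_cc=0.8, p_cp=0.3, p_pp=0.05, seed=1)
dens = block_densities(labels, edges)
print(dens)
```

With these settings the observed densities follow the same ordering as the probabilities used to generate the graph. Deciding whether such an ordering in real data is statistically significant, and whether it is intrinsic or induced, is the question the paper's intersection tests address; this sketch does not attempt that.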