🤖 AI Summary
Classical assumptions—such as strong homogeneity or graph denseness—frequently fail in latent-variable-driven network link regression, undermining the validity of standard linear regression inference.
Method: We propose a linear regression framework valid under weaker, more realistic assumptions. Leveraging the Aldous–Hoover representation, we construct a jointly exchangeable regression array and develop a bias-corrected estimator that achieves consistent and asymptotically normal inference for natural target parameters—even under weak graph sparsity. The method employs network summary statistics—including local subgraph frequencies and spectral embeddings—as covariates, and its theoretical foundation ensures bootstrap consistency.
Results: Extensive simulations and empirical analysis on real-world elementary-school academic interaction data demonstrate that our approach substantially improves robustness and accuracy in network link regression inference, effectively relaxing reliance on restrictive homogeneity or denseness assumptions.
📝 Abstract
We consider statistical inference for network-linked regression problems, where covariates may include network summary statistics computed for each node. In settings involving network data, it is often natural to posit that latent variables govern connection probabilities in the graph. Since the presence of these latent features makes classical regression assumptions even less tenable, we propose an assumption-lean framework for linear regression with jointly exchangeable regression arrays. We establish an analog of the Aldous-Hoover representation for such arrays, which may be of independent interest. Moreover, we consider two different projection parameters as potential targets and establish conditions under which asymptotic normality and bootstrap consistency hold when commonly used network statistics, including local subgraph frequencies and spectral embeddings, are used as covariates. In the case of linear regression with local count statistics, we show that a bias-corrected estimator allows one to target a more natural inferential target under weaker sparsity conditions compared to the OLS estimator. Our inferential tools are illustrated using both simulated data and real data related to the academic climate of elementary schools.