Assumption-lean Inference for Network-linked Data

📅 2025-09-30

📈 Citations: 0

✨ Influential: 0

career value

195K/year

🤖 AI Summary

Classical assumptions—such as strong homogeneity or graph denseness—frequently fail in latent-variable-driven network link regression, undermining the validity of standard linear regression inference. Method: We propose a linear regression framework valid under weaker, more realistic assumptions. Leveraging the Aldous–Hoover representation, we construct a jointly exchangeable regression array and develop a bias-corrected estimator that achieves consistent and asymptotically normal inference for natural target parameters—even under weak graph sparsity. The method employs network summary statistics—including local subgraph frequencies and spectral embeddings—as covariates, and its theoretical foundation ensures bootstrap consistency. Results: Extensive simulations and empirical analysis on real-world elementary-school academic interaction data demonstrate that our approach substantially improves robustness and accuracy in network link regression inference, effectively relaxing reliance on restrictive homogeneity or denseness assumptions.

Technology Category

Application Category

📝 Abstract

We consider statistical inference for network-linked regression problems, where covariates may include network summary statistics computed for each node. In settings involving network data, it is often natural to posit that latent variables govern connection probabilities in the graph. Since the presence of these latent features makes classical regression assumptions even less tenable, we propose an assumption-lean framework for linear regression with jointly exchangeable regression arrays. We establish an analog of the Aldous-Hoover representation for such arrays, which may be of independent interest. Moreover, we consider two different projection parameters as potential targets and establish conditions under which asymptotic normality and bootstrap consistency hold when commonly used network statistics, including local subgraph frequencies and spectral embeddings, are used as covariates. In the case of linear regression with local count statistics, we show that a bias-corrected estimator allows one to target a more natural inferential target under weaker sparsity conditions compared to the OLS estimator. Our inferential tools are illustrated using both simulated data and real data related to the academic climate of elementary schools.

Problem

Research questions and friction points this paper is trying to address.

Statistical inference for network-linked regression problems

Addressing latent variables in network connection probabilities

Developing assumption-lean framework for exchangeable regression arrays

Innovation

Methods, ideas, or system contributions that make the work stand out.

Assumption-lean framework for network-linked regression

Bias-corrected estimator for linear regression

Aldous-Hoover representation for exchangeable arrays

🔎 Similar Papers

No similar papers found.