Network signflip parallel analysis for selecting the embedding dimension

📅 2025-09-06
🤖 AI Summary
Problem: In large heterogeneous networks with weakly distinguishable community structure, choosing the embedding dimension is difficult because the informative eigenvalues do not diverge. Method: The paper proposes network signflip parallel analysis (NetFlipPA), a data-driven spectral method that does not require careful parameter tuning. It builds a null reference matrix by randomly flipping the signs of the entries of the normalized adjacency matrix, then compares the eigenvalues of the original matrix with those of the signflipped one. The analysis combines non-asymptotic perturbation bounds with local laws for nonhomogeneous Wigner-type random matrices to locate the spectral noise floor, i.e., the upper edge of the noise eigenvalues that separates signal from noise. Contribution/Results: Under the degree-corrected stochastic blockmodel, NetFlipPA provably recovers the spectral noise floor, and the embedding dimension is selected by keeping the eigenvalues that rise above it. Compared with traditional cutoff-based methods, its data-driven threshold is provably effective under milder assumptions on node degree heterogeneity and the number of communities.

📝 Abstract
This paper investigates the problem of selecting the embedding dimension for large heterogeneous networks that have weakly distinguishable community structure. For a broad family of embeddings based on normalized adjacency matrices, we introduce a novel spectral method that compares the eigenvalues of the normalized adjacency matrix to those obtained after randomly signflipping its entries. The proposed method, called network signflip parallel analysis (NetFlipPA), is interpretable, simple to implement, data driven, and does not require users to carefully tune parameters. For large random graphs arising from degree-corrected stochastic blockmodels with weakly distinguishable community structure (and consequently, non-diverging eigenvalues), NetFlipPA provably recovers the spectral noise floor (i.e., the upper-edge of the eigenvalues corresponding to the noise component of the normalized adjacency matrix). NetFlipPA thus provides a statistically rigorous randomization-based method for selecting the embedding dimension by keeping the eigenvalues that rise above the recovered spectral noise floor. Compared to traditional cutoff-based methods, the data-driven threshold used in NetFlipPA is provably effective under milder assumptions on the node degree heterogeneity and the number of node communities. Our main results rely on careful non-asymptotic perturbation analysis and leverage recent progress on local laws for nonhomogeneous Wigner-type random matrices.
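The comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the symmetric normalization D^{-1/2} A D^{-1/2}, takes the largest absolute eigenvalue of the signflipped matrix as the noise floor, and counts eigenvalues of the original matrix by absolute value. The function names and the `n_flips` averaging option are hypothetical.

```python
import numpy as np


def normalized_adjacency(A):
    """Symmetric normalization D^{-1/2} A D^{-1/2} (an assumed choice)."""
    deg = A.sum(axis=1).astype(float)
    inv_sqrt = np.zeros_like(deg)
    pos = deg > 0
    inv_sqrt[pos] = 1.0 / np.sqrt(deg[pos])  # isolated nodes keep weight 0
    return A * inv_sqrt[:, None] * inv_sqrt[None, :]


def signflip(M, rng):
    """Multiply entries by independent random signs while keeping symmetry."""
    n = M.shape[0]
    signs = rng.choice([-1.0, 1.0], size=(n, n))
    signs = np.triu(signs) + np.triu(signs, 1).T  # mirror the upper triangle
    return M * signs


def netflippa_sketch(A, n_flips=1, seed=0):
    """Return (estimated dimension, estimated noise floor).

    Assumption: the floor is the spectral norm of the signflipped matrix,
    averaged over n_flips independent flips (hypothetical option).
    """
    rng = np.random.default_rng(seed)
    L = normalized_adjacency(A)
    eigs = np.abs(np.linalg.eigvalsh(L))
    floors = [
        np.max(np.abs(np.linalg.eigvalsh(signflip(L, rng))))
        for _ in range(n_flips)
    ]
    floor = float(np.mean(floors))
    k = int(np.sum(eigs > floor))  # eigenvalues rising above the floor
    return k, floor
```

Calling `netflippa_sketch(A)` on a symmetric 0/1 adjacency matrix `A` returns the number of eigenvalues above the signflip threshold together with the threshold itself. Signflipping is intended to destroy the low-rank signal while keeping the entrywise magnitudes, which is why the flipped spectrum can serve as a noise reference.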
Problem

Research questions and friction points this paper is trying to address.

Selecting embedding dimension for large heterogeneous networks
Identifying spectral noise floor in normalized adjacency matrices
Handling weakly distinguishable community structures
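
To experiment with the setting named above, one can simulate a graph from a degree-corrected stochastic blockmodel. The sketch below assumes the common parameterization P_ij = theta_i * theta_j * B[z_i, z_j] with no self-loops. The function name and arguments are hypothetical and are not taken from the paper.

```python
import numpy as np


def sample_dcsbm(B, theta, labels, seed=0):
    """Sample a symmetric, hollow 0/1 adjacency matrix from a DCSBM.

    B      : (K, K) symmetric block connectivity matrix
    theta  : (n,) positive degree-correction parameters
    labels : (n,) integer community labels in {0, ..., K-1}
    """
    rng = np.random.default_rng(seed)
    n = len(labels)
    # Edge probabilities, clipped so they stay valid probabilities.
    P = np.outer(theta, theta) * B[np.ix_(labels, labels)]
    P = np.clip(P, 0.0, 1.0)
    # Draw each pair once in the strict upper triangle, then mirror it.
    upper = np.triu(rng.random((n, n)) < P, k=1).astype(float)
    return upper + upper.T


# Example: two communities whose within- and between-block rates are close,
# with heterogeneous degrees.
labels = np.repeat([0, 1], 200)
theta = np.random.default_rng(0).uniform(0.5, 1.5, size=labels.size)
B = np.array([[0.12, 0.08], [0.08, 0.12]])
A = sample_dcsbm(B, theta, labels)
```

Moving the off-diagonal entries of `B` closer to the diagonal ones makes the communities harder to distinguish, and widening the range of `theta` increases degree heterogeneity.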
Innovation

Methods, ideas, or system contributions that make the work stand out.

Random signflipping of adjacency matrix entries
Comparing eigenvalues to identify spectral noise floor
Data-driven threshold for embedding dimension selection