Graph Sampling for Scalable and Expressive Graph Neural Networks on Homophilic Graphs

📅 2024-10-22
🏛️ arXiv.org
🤖 AI Summary
To address the poor scalability and degraded expressiveness of Graph Neural Networks (GNNs) on large-scale homophilic graphs, this paper proposes a graph sampling algorithm grounded in feature homophily. Conventional random sampling often yields disconnected subgraphs, whereas the proposed method is the first to explicitly incorporate feature homophily into the sampling objective. Specifically, it minimizes the trace of the data correlation matrix, which approximately preserves the trace of the graph Laplacian (a proxy for graph connectivity) and so promotes subgraph connectivity and structural fidelity at lower complexity than spectral methods. Experiments on citation networks report improved Laplacian trace preservation, by up to 37% over random sampling, and an average gain of 2.1% in cross-graph GNN transfer accuracy. These results support improved model transferability and generalization.

📝 Abstract
Graph Neural Networks (GNNs) excel in many graph machine learning tasks but face challenges when scaling to large networks. GNN transferability allows training on smaller graphs and applying the model to larger ones, but existing methods often rely on random subsampling, leading to disconnected subgraphs and reduced model expressivity. We propose a novel graph sampling algorithm that leverages feature homophily to preserve graph structure. By minimizing the trace of the data correlation matrix, our method better preserves the graph Laplacian trace -- a proxy for the graph connectivity -- than random sampling, while achieving lower complexity than spectral methods. Experiments on citation networks show improved performance in preserving Laplacian trace and GNN transferability compared to random sampling.
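The effect of random subsampling on the Laplacian trace can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a synthetic Erdős–Rényi graph rather than a citation network, the combinatorial Laplacian L = D - A, and uniform random node sampling as the baseline. It does not implement the paper's homophily-based algorithm, and all function names are hypothetical.

```python
import numpy as np

def laplacian_trace(adj):
    """Trace of the combinatorial Laplacian L = D - A (equals the sum of degrees)."""
    laplacian = np.diag(adj.sum(axis=1)) - adj
    return float(np.trace(laplacian))

def induced_subgraph(adj, nodes):
    """Adjacency matrix of the subgraph induced by the given node indices."""
    return adj[np.ix_(nodes, nodes)]

def count_isolated(adj):
    """Number of nodes with no incident edge."""
    return int((adj.sum(axis=1) == 0).sum())

def random_graph(n, p, rng):
    """Symmetric, unweighted random graph without self-loops (assumed test data)."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(float)

rng = np.random.default_rng(0)
adj = random_graph(200, 0.03, rng)

# Baseline from the abstract: keep a uniformly random subset of nodes.
nodes = rng.choice(adj.shape[0], size=50, replace=False)
sub = induced_subgraph(adj, nodes)

print("full graph Laplacian trace:", laplacian_trace(adj))
print("sampled subgraph Laplacian trace:", laplacian_trace(sub))
print("isolated nodes in sample:", count_isolated(sub))
```

Comparing the printed traces, and counting isolated nodes in the sample, gives a rough sense of how much connectivity a given sampling rule retains; any alternative rule can be evaluated by swapping out how `nodes` is chosen.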

Problem

Research questions and friction points this paper is trying to address.

GNNs scale poorly to large graphs.
Transferring a GNN trained on a smaller graph usually relies on random subsampling, which produces disconnected subgraphs and reduces model expressivity.
Can a homophily-based sampling method preserve graph connectivity at lower complexity than spectral methods?

Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages feature homophily to guide graph sampling.
Minimizes the trace of the data correlation matrix.
Improves Laplacian trace preservation and GNN transferability relative to random sampling.
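
For readers who want the quantities behind these points spelled out, the block below records two standard identities for an undirected, unweighted graph without self-loops. The notation (adjacency matrix A, degree matrix D, Laplacian L, feature matrix X with rows x_i, edge set E) is assumed here for illustration and is not taken from the source; the paper's exact objective and bounds are not reproduced.

```latex
% Notation assumed for illustration: A adjacency, D degree matrix,
% L combinatorial Laplacian, X node-feature matrix with rows x_i, E edge set.
\[
  L = D - A, \qquad
  \operatorname{tr}(L) = \sum_{i=1}^{n} d_i = 2\,|E|
\]
\[
  \operatorname{tr}\!\left(X^{\top} L X\right)
  = \sum_{(i,j) \in E} \lVert x_i - x_j \rVert_2^2
\]
```

The first identity shows why the Laplacian trace tracks how many edges a sampled subgraph retains. The second quantity is small when connected nodes have similar features, which is one common way to formalize feature homophily.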