EmbedPart: Embedding-Driven Graph Partitioning for Scalable Graph Neural Network Training

📅 2026-04-01
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the challenge of balancing efficiency and quality in graph partitioning for large-scale graph neural network (GNN) training. The authors propose a novel approach that, for the first time, shifts graph partitioning from the original sparse topological space to the dense node embedding space generated during GNN training. By leveraging efficient clustering in this embedding space, the method achieves high-quality partitions with substantially reduced computational overhead. It further enables rapid repartitioning and graph reordering, significantly enhancing scalability. Experimental results demonstrate that the proposed technique accelerates partitioning by over 100× compared to Metis while preserving partition quality, thereby effectively speeding up both single-machine and distributed GNN training.
๐Ÿ“ Abstract
Graph Neural Networks (GNNs) are widely used for learning on graph-structured data, but scaling GNN training to massive graphs remains challenging. To enable scalable distributed training, graphs are divided into smaller partitions that are distributed across multiple machines such that inter-machine communication is minimized and computational load is balanced. In practice, existing partitioning approaches face a fundamental trade-off between partitioning overhead and partitioning quality. We propose EmbedPart, an embedding-driven partitioning approach that achieves both speed and quality. Instead of operating directly on irregular graph structures, EmbedPart leverages node embeddings produced during the actual GNN training workload and clusters these dense embeddings to derive a partitioning. EmbedPart achieves more than 100x speedup over Metis while maintaining competitive partitioning quality and accelerating distributed GNN training. Moreover, EmbedPart naturally supports graph updates and fast repartitioning, and can be applied to graph reordering to improve data locality and accelerate single-machine GNN training. By shifting partitioning from irregular graph structures to dense embeddings, EmbedPart enables scalable and high-quality graph data optimization.
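The core idea in the abstract — cluster the dense node embeddings produced during GNN training, then read the cluster assignment off as a partitioning — can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `embed_part`, the plain k-means clustering, and the greedy capacity cap used for load balancing are all assumptions made for illustration.

```python
import numpy as np

def embed_part(embeddings, num_parts, iters=20, seed=0):
    # Hypothetical sketch (not the paper's code): k-means over dense
    # node embeddings, followed by a greedy capacity cap so that no
    # partition holds more than ceil(n / num_parts) nodes.
    emb = np.asarray(embeddings, dtype=float)
    n = emb.shape[0]
    rng = np.random.default_rng(seed)
    centroids = emb[rng.choice(n, num_parts, replace=False)].copy()
    for _ in range(iters):
        # Pairwise node-to-centroid distances, shape (n, num_parts).
        dist = np.linalg.norm(emb[:, None, :] - centroids[None, :, :], axis=2)
        assign = dist.argmin(axis=1)
        for k in range(num_parts):
            mask = assign == k
            if mask.any():
                centroids[k] = emb[mask].mean(axis=0)
    # Final distances to the converged centroids.
    dist = np.linalg.norm(emb[:, None, :] - centroids[None, :, :], axis=2)
    cap = -(-n // num_parts)  # ceil(n / num_parts)
    part = np.empty(n, dtype=int)
    counts = np.zeros(num_parts, dtype=int)
    # Place the most confident nodes first; spill a node to its
    # next-closest centroid when a partition is already full.
    for i in dist.min(axis=1).argsort():
        for k in dist[i].argsort():
            if counts[k] < cap:
                part[i] = k
                counts[k] += 1
                break
    return part
```

Because clustering runs on a dense `(n, d)` matrix rather than an irregular adjacency structure, each step is a vectorized array operation — which is the source of the speedup over topology-based partitioners the abstract describes.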
Problem

Research questions and friction points this paper is trying to address.

Graph Neural Networks
Graph Partitioning
Scalable Training
Distributed Computing
Partitioning Quality
Innovation

Methods, ideas, or system contributions that make the work stand out.

embedding-driven partitioning
graph neural networks
scalable training
distributed graph processing
graph reordering
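The "graph reordering" contribution listed above — using the partitioning to improve data locality in single-machine training — can be sketched as follows. This is an illustrative assumption, not the authors' code: relabel nodes so that members of the same partition receive contiguous IDs, which keeps neighboring feature rows close together in memory.

```python
import numpy as np

def reorder_by_partition(part):
    # Hypothetical sketch: map old node IDs to new IDs so that each
    # partition occupies one contiguous block of the ID space.
    order = np.argsort(part, kind="stable")  # old IDs grouped by partition
    new_id = np.empty_like(order)
    new_id[order] = np.arange(len(order))    # new_id[old] = new
    return new_id

# Remapping an edge list under the new labeling is a single gather:
# new_edges = new_id[edges]
```

The stable sort preserves the original relative order of nodes within each partition, so the relabeling is a pure permutation and can be inverted if the original IDs are needed later.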