EmbedPart: Embedding-Driven Graph Partitioning for Scalable Graph Neural Network Training

📅 2026-04-01
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the challenge of balancing efficiency and quality in graph partitioning for large-scale graph neural network (GNN) training. The authors propose a novel approach that, for the first time, shifts graph partitioning from the original sparse topological space to the dense node embedding space generated during GNN training. By leveraging efficient clustering in this embedding space, the method achieves high-quality partitions with substantially reduced computational overhead. It further enables rapid repartitioning and graph reordering, significantly enhancing scalability. Experimental results demonstrate that the proposed technique accelerates partitioning by over 100× compared to Metis while preserving partition quality, thereby effectively speeding up both single-machine and distributed GNN training.
๐Ÿ“ Abstract
Graph Neural Networks (GNNs) are widely used for learning on graph-structured data, but scaling GNN training to massive graphs remains challenging. To enable scalable distributed training, graphs are divided into smaller partitions that are distributed across multiple machines such that inter-machine communication is minimized and computational load is balanced. In practice, existing partitioning approaches face a fundamental trade-off between partitioning overhead and partitioning quality. We propose EmbedPart, an embedding-driven partitioning approach that achieves both speed and quality. Instead of operating directly on irregular graph structures, EmbedPart leverages node embeddings produced during the actual GNN training workload and clusters these dense embeddings to derive a partitioning. EmbedPart achieves more than 100x speedup over Metis while maintaining competitive partitioning quality and accelerating distributed GNN training. Moreover, EmbedPart naturally supports graph updates and fast repartitioning, and can be applied to graph reordering to improve data locality and accelerate single-machine GNN training. By shifting partitioning from irregular graph structures to dense embeddings, EmbedPart enables scalable and high-quality graph data optimization.
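The core idea in the abstract — cluster the dense node embeddings produced during GNN training, then read the cluster assignment off as a partitioning — can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `embed_part`, the plain k-means clustering, and the greedy capacity cap used for load balancing are all assumptions made for illustration.

```python
import numpy as np

def embed_part(embeddings, num_parts, iters=20, seed=0):
    # Hypothetical sketch (not the paper's code): k-means over dense
    # node embeddings, followed by a greedy capacity cap so that no
    # partition holds more than ceil(n / num_parts) nodes.
    emb = np.asarray(embeddings, dtype=float)
    n = emb.shape[0]
    rng = np.random.default_rng(seed)
    centroids = emb[rng.choice(n, num_parts, replace=False)].copy()
    for _ in range(iters):
        # Pairwise node-to-centroid distances, shape (n, num_parts).
        dist = np.linalg.norm(emb[:, None, :] - centroids[None, :, :], axis=2)
        assign = dist.argmin(axis=1)
        for k in range(num_parts):
            mask = assign == k
            if mask.any():
                centroids[k] = emb[mask].mean(axis=0)
    # Final distances to the converged centroids.
    dist = np.linalg.norm(emb[:, None, :] - centroids[None, :, :], axis=2)
    cap = -(-n // num_parts)  # ceil(n / num_parts)
    part = np.empty(n, dtype=int)
    counts = np.zeros(num_parts, dtype=int)
    # Place the most confident nodes first; spill a node to its
    # next-closest centroid when a partition is already full.
    for i in dist.min(axis=1).argsort():
        for k in dist[i].argsort():
            if counts[k] < cap:
                part[i] = k
                counts[k] += 1
                break
    return part
```

Because clustering runs on a dense `(n, d)` matrix rather than an irregular adjacency structure, each step is a vectorized array operation — which is the source of the speedup over topology-based partitioners the abstract describes.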
Problem

Research questions and friction points this paper is trying to address.

Graph Neural Networks
Graph Partitioning
Scalable Training
Distributed Computing
Partitioning Quality
Innovation

Methods, ideas, or system contributions that make the work stand out.

embedding-driven partitioning
graph neural networks
scalable training
distributed graph processing
graph reordering
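The "graph reordering" contribution listed above — using the partitioning to improve data locality in single-machine training — can be sketched as follows. This is an illustrative assumption, not the authors' code: relabel nodes so that members of the same partition receive contiguous IDs, which keeps neighboring feature rows close together in memory.

```python
import numpy as np

def reorder_by_partition(part):
    # Hypothetical sketch: map old node IDs to new IDs so that each
    # partition occupies one contiguous block of the ID space.
    order = np.argsort(part, kind="stable")  # old IDs grouped by partition
    new_id = np.empty_like(order)
    new_id[order] = np.arange(len(order))    # new_id[old] = new
    return new_id

# Remapping an edge list under the new labeling is a single gather:
# new_edges = new_id[edges]
```

The stable sort preserves the original relative order of nodes within each partition, so the relabeling is a pure permutation and can be inverted if the original IDs are needed later.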