MixGCN: Scalable GCN Training by Mixture of Parallelism and Mixture of Accelerators

📅 2025-01-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing partitioning-based parallel methods for full-graph GCN training suffer from memory explosion, inefficient resource utilization caused by alternating sparse/dense computations, and high communication overhead with only partial memory relief. To address these issues, this paper proposes a synergistic optimization framework that integrates hybrid parallelism with heterogeneous acceleration. Its key contributions are: (1) a hybrid parallelism strategy with theoretical guarantees of constant communication volume; (2) a fine-grained pipelined architecture for a sparse/dense heterogeneous GCN accelerator; and (3) a co-optimized scheduling mechanism with operator fusion. Experiments on large-scale graphs demonstrate that the framework achieves a controllable memory footprint, improved load balancing, and significant end-to-end training speedup. Crucially, communication volume remains invariant as the graph scales, while the framework delivers strong scalability and hardware adaptability.
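The claim of graph-size-invariant communication can be intuited with a toy sketch of feature-dimension parallelism: if each device owns a column slice of the weight matrix, the dense transform is computed independently per device with no exchange of node features. This is an illustrative simplification, not the paper's actual hybrid scheme; all names below are hypothetical.

```python
import numpy as np

# Toy sketch: split the dense transform X @ W along the feature
# (column) dimension across 4 "devices". Each device computes its
# slice independently; no node features cross device boundaries,
# regardless of how many nodes the graph has.
X = np.random.rand(1000, 64)   # node features (replicated here for simplicity)
W = np.random.rand(64, 32)     # layer weights

n_dev = 4
col_slices = np.array_split(np.arange(W.shape[1]), n_dev)
partials = [X @ W[:, s] for s in col_slices]  # independent per-device work
Z = np.concatenate(partials, axis=1)          # reassemble the full output

assert np.allclose(Z, X @ W)
```

By contrast, node-partition parallelism must ship boundary-node features across devices, so its communication volume grows with the number of cut edges as the graph scales.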

📝 Abstract
Graph convolutional networks (GCNs) have demonstrated superiority in graph-based learning tasks. However, training GCNs on full graphs is particularly challenging due to two factors: (1) the associated feature tensors can easily exhaust the memory and saturate the communication bandwidth of modern accelerators, and (2) the computation workflow in training GCNs alternates between sparse and dense matrix operations, complicating the efficient utilization of computational resources. Existing solutions for scalable distributed full-graph GCN training mostly adopt partition parallelism, which is unsatisfactory as they only partially address the first challenge while incurring scaled-out communication volume. To this end, we propose MixGCN, aiming to simultaneously address both of the aforementioned challenges in GCN training. To tackle the first challenge, MixGCN integrates a mixture of parallelism; both theoretical and empirical analysis verify its constant communication volume and enhanced workload balance. To handle the second challenge, we consider a mixture of accelerators (i.e., sparse and dense accelerators) with a dedicated accelerator for GCN training and a fine-grained pipeline. Extensive experiments show that MixGCN achieves boosted training efficiency and scalability.
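The sparse/dense alternation described in the abstract can be seen in a single GCN layer: neighbor aggregation is a sparse-dense matrix product, while the feature transformation is dense-dense. A minimal sketch (the graph and dimensions below are made up for illustration):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy graph: 4 nodes, 5 directed edges, stored as a sparse adjacency matrix.
rows = np.array([0, 1, 1, 2, 3])
cols = np.array([1, 0, 2, 1, 2])
A = csr_matrix((np.ones(5), (rows, cols)), shape=(4, 4))

X = np.random.rand(4, 8).astype(np.float32)   # node features (dense)
W = np.random.rand(8, 16).astype(np.float32)  # layer weights (dense)

# One GCN layer alternates two very different kernels:
H = A @ X               # sparse x dense: neighbor aggregation
Z = H @ W               # dense  x dense: feature transformation
out = np.maximum(Z, 0)  # ReLU

print(out.shape)  # (4, 16)
```

This kernel mismatch is why a single accelerator type is hard to keep fully utilized, motivating the sparse/dense mixture of accelerators.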
Problem

Research questions and friction points this paper is trying to address.

Graph Convolutional Networks
Memory Limitations
Parallel Processing
Innovation

Methods, ideas, or system contributions that make the work stand out.

MixGCN
Parallel Processing
Data Management
Cheng Wan
Georgia Institute of Technology
Runkao Tao
Rutgers University
Zheng Du
University of Minnesota Twin Cities
Yang Katie Zhao
Assistant Professor, University of Minnesota, Twin Cities
Computer Architecture, Domain-Specific Acceleration, Machine Learning
Yingyan Celine Lin
Georgia Institute of Technology