CondenseGraph: Communication-Efficient Distributed GNN Training via On-the-Fly Graph Condensation

📅 2026-01-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the high communication overhead in distributed graph neural network (GNN) training, caused by neighbor dependencies that force workers to frequently exchange boundary-node features across partitions. To mitigate this, the authors propose a communication-efficient training framework built around a dynamic, on-the-fly graph condensation mechanism that aggregates boundary nodes into compact super-nodes before transmission. A gradient error-feedback strategy compensates for the information loss introduced by compression, preserving model convergence and accuracy. Experiments on four benchmark datasets show that the proposed method reduces communication volume by 40%–60%, significantly shortens training time, and matches the accuracy of uncompressed baselines.
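The paper does not publish its condensation routine here, but the idea of aggregating boundary nodes into super-nodes can be sketched generically. The snippet below is an assumption-laden illustration (the cluster assignment, mean pooling, and function name `condense_boundary` are mine, not the authors'): boundary-node features are pooled per cluster so only one vector per super-node crosses the network.

```python
import numpy as np

def condense_boundary(features: np.ndarray, assignment: np.ndarray,
                      num_super: int) -> np.ndarray:
    """Illustrative super-node condensation via mean pooling.

    features:   (n, d) boundary-node feature matrix on this partition
    assignment: (n,) cluster id in [0, num_super) for each boundary node
    Returns a (num_super, d) matrix transmitted in place of `features`,
    shrinking the message from n rows to num_super rows.
    """
    condensed = np.zeros((num_super, features.shape[1]))
    counts = np.bincount(assignment, minlength=num_super).reshape(-1, 1)
    np.add.at(condensed, assignment, features)      # per-cluster sums
    return condensed / np.maximum(counts, 1)        # guard empty clusters
```

With three boundary nodes condensed into two super-nodes, the transmitted payload drops from 3 feature rows to 2; the actual paper presumably chooses clusters and pooling adaptively rather than with a fixed assignment.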

📝 Abstract
Distributed Graph Neural Network (GNN) training suffers from substantial communication overhead due to the inherent neighborhood dependency in graph-structured data. This neighbor explosion problem requires workers to frequently exchange boundary node features across partitions, creating a communication bottleneck that severely limits training scalability. Existing approaches rely on static graph partitioning strategies that cannot adapt to dynamic network conditions. In this paper, we propose CondenseGraph, a novel communication-efficient framework for distributed GNN training. Our key innovation is an on-the-fly graph condensation mechanism that dynamically compresses boundary node features into compact super nodes before transmission. To compensate for the information loss introduced by compression, we develop a gradient-based error feedback mechanism that maintains convergence guarantees while reducing communication volume by 40-60%. Extensive experiments on four benchmark datasets demonstrate that CondenseGraph achieves comparable accuracy to full-precision baselines while significantly reducing communication costs and training time.
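The error-feedback mechanism mentioned in the abstract follows a well-known pattern in compressed distributed training: accumulate the compression error locally and add it back to the next message so that, over time, no information is permanently lost. The sketch below is a minimal generic version, not the paper's implementation; the top-k compressor and the names `ef_step`/`topk_compress` are illustrative assumptions.

```python
import numpy as np

def topk_compress(v: np.ndarray, k: int = 1) -> np.ndarray:
    """Toy compressor: keep the k largest-magnitude entries, zero the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def ef_step(x: np.ndarray, residual: np.ndarray, compress):
    """One error-feedback round.

    x:        the message (e.g. a gradient or feature update) to transmit
    residual: the accumulated compression error from previous rounds
    Returns (sent, new_residual); `sent` goes over the wire, and the
    part that compression dropped is carried into the next round.
    """
    corrected = x + residual
    sent = compress(corrected)
    return sent, corrected - sent
```

Because the residual is folded back in each round, components that a single compressed message drops are eventually transmitted, which is what underpins convergence guarantees for this family of methods.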
Problem

Research questions and friction points this paper is trying to address.

distributed GNN training
communication overhead
neighbor explosion
graph partitioning
boundary node features
Innovation

Methods, ideas, or system contributions that make the work stand out.

graph condensation
distributed GNN training
communication efficiency
error feedback
dynamic compression
Zizhao Zhang (University of Michigan)
Yihan Xue (University of Southern California)
Haotian Zhu (New York University)
Sijia Li (Institute of Information Engineering, Chinese Academy of Sciences)
Zhijun Wang (Institute of Physics, Chinese Academy of Sciences)
Yujie Xiao (University of California, Berkeley)