🤖 AI Summary
This work addresses two challenges in federated graph neural networks: high communication overhead and the performance degradation caused by non-independent and identically distributed (non-IID) data. To this end, the authors propose CeFGC, a framework that, for the first time, introduces generative diffusion models into federated graph learning. Each client trains a diffusion model on its local data to generate synthetic graphs, which are then combined with real graphs to train a local graph neural network (GNN). Global model aggregation completes within only three communication rounds, reducing communication complexity to a constant and substantially mitigating the generalization difficulties posed by non-IID data. Experimental results on multiple real-world graph datasets show that CeFGC achieves superior node classification accuracy at significantly lower communication cost, with particularly pronounced advantages in non-IID settings.
📝 Abstract
Graph Neural Networks (GNNs) unlock new ways of learning from graph-structured data, proving highly effective in capturing complex relationships and patterns. Federated GNNs (FGNNs) have emerged as a prominent distributed learning paradigm for training GNNs over decentralized data. However, FGNNs face two significant challenges: high communication overhead from multiple rounds of parameter exchanges and non-IID data characteristics across clients. To address these issues, we introduce CeFGC, a novel FGNN paradigm that facilitates efficient GNN training over non-IID data by limiting communication between the server and clients to only three rounds. The core idea of CeFGC is to leverage generative diffusion models to minimize direct client-server communication. Each client trains a generative diffusion model that captures its local graph distribution and shares this model with the server, which then redistributes it back to all clients. Using these generative models, clients generate synthetic graphs that are combined with their local graphs to train local GNN models. Finally, clients upload their model weights to the server for aggregation into a global GNN model. We theoretically analyze the communication volume to show that CeFGC requires only a constant number of communication rounds, namely three. Extensive experiments on several real graph datasets demonstrate the effectiveness and efficiency of CeFGC against state-of-the-art competitors; its superior performance on non-IID graphs stems from aligning local and global model objectives and enriching the training set with diverse graphs.
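The three-round exchange described in the abstract can be sketched as a small simulation. This is an illustrative sketch only, not the authors' implementation: all names are hypothetical, models are stand-in dicts of floats, and the actual diffusion-model training, synthetic-graph sampling, and GNN training are elided into comments. The final aggregation is assumed here to be FedAvg-style weight averaging.

```python
def cefgc_protocol(clients):
    """Hypothetical sketch of CeFGC's three communication rounds.

    Each client is a dict with a stand-in 'diffusion_model' and, after its
    local training step, stand-in 'gnn_weights' (dicts of floats).
    """
    rounds = 0

    # Round 1: each client uploads its locally trained diffusion model,
    # which captures the client's local graph distribution.
    diffusion_models = [c["diffusion_model"] for c in clients]
    rounds += 1

    # Round 2: the server redistributes all diffusion models to every client.
    for c in clients:
        c["received_models"] = list(diffusion_models)
    rounds += 1

    # Local step (no communication): each client would sample synthetic
    # graphs from the received models, mix them with its real graphs, and
    # train a local GNN. Here the result is the precomputed 'gnn_weights'.

    # Round 3: clients upload their GNN weights; the server averages them
    # (FedAvg-style averaging assumed) into the global model.
    stacked = [c["gnn_weights"] for c in clients]
    rounds += 1
    global_model = {k: sum(w[k] for w in stacked) / len(stacked)
                    for k in stacked[0]}
    return global_model, rounds
```

Under this sketch the round count is a constant three regardless of how many local training epochs each client runs, which is the source of the constant communication complexity claimed in the abstract.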