A Comprehensive Data-centric Overview of Federated Graph Learning

📅 2025-07-22
🤖 AI Summary
Existing surveys on federated graph learning (FGL) predominantly emphasize algorithmic designs and simulation-based evaluations, lacking a systematic taxonomy grounded in data characteristics and usage patterns—thus hindering data-centric performance optimization. To address this gap, we propose the first data-oriented, two-tier classification framework: the first tier categorizes FGL settings along three orthogonal dimensions of data structure and distribution properties; the second tier organizes methodologies along three orthogonal dimensions of training workflow and technical enablers. Leveraging this framework, we systematically analyze cross-device collaborative training, privacy-preserving mechanisms, and large-model integration strategies, while covering representative real-world application scenarios. Our work not only redefines the FGL research paradigm but also uncovers fundamental modeling principles under data constraints, providing both theoretical foundations and actionable technical pathways for enhancing model performance.

📝 Abstract
In the era of big data applications, Federated Graph Learning (FGL) has emerged as a prominent solution that reconciles the tradeoff between optimizing the collective intelligence of decentralized dataset holders and preserving sensitive information as fully as possible. Existing FGL surveys have contributed meaningfully but largely focus on integrating Federated Learning (FL) and Graph Machine Learning (GML), resulting in early-stage taxonomies that emphasize methodology and simulated scenarios. Notably, a data-centric perspective, which systematically examines FGL methods through the lens of data properties and usage, has not been adopted to reorganize FGL research, even though it is critical for assessing how FGL studies tackle data-centric constraints to enhance model performance. This survey proposes a two-level data-centric taxonomy: Data Characteristics, which categorizes studies based on the structural and distributional properties of datasets used in FGL, and Data Utilization, which analyzes the training procedures and techniques employed to overcome key data-centric challenges. Each taxonomy level is defined by three orthogonal criteria, each representing a distinct data-centric configuration. Beyond the taxonomy, this survey examines FGL integration with Pretrained Large Models, showcases realistic applications, and highlights future directions aligned with emerging trends in GML.
Problem

Research questions and friction points this paper is trying to address.

Examines Federated Graph Learning from a data-centric perspective
Proposes taxonomy for data characteristics and utilization in FGL
Assesses how FGL methods handle data-centric constraints to improve model performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-level data-centric taxonomy for FGL
Integrates Pretrained Large Models with FGL
Focuses on Data Characteristics and Utilization
Zhengyu Wu
PhD Candidate at Beijing Institute of Technology
Federated Graph Learning, Data-centric Graph Learning, Pre-trained Large Models with Graph Learning
Xunkai Li
School of Computer Science and Technology, Beijing Institute of Technology
Data-centric AI, Graph ML, AI4Science
Yinlin Zhu
Sun Yat-sen University
Graph Neural Networks, Federated Learning
Zekai Chen
Beijing Institute of Technology, Beijing, 100811, China
Guochen Yan
Peking University
Graph, Federated learning, Trustworthy AI
Yanyu Yan
Institute of Information Science, Beijing Jiaotong University, Beijing 100044, China
Hao Zhang
Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100083, China
Yuming Ai
Beijing Institute of Technology, Beijing, 100811, China
Xinmo Jin
Beijing Institute of Technology, Beijing, 100811, China
Rong-Hua Li
Beijing Institute of Technology
Algorithms for (big) graph, matrix, and sequence data
Guoren Wang
Beijing Institute of Technology