Full-Graph vs. Mini-Batch Training: Comprehensive Analysis from a Batch Size and Fan-Out Size Perspective

📅 2026-01-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates the trade-off between full-graph and mini-batch training strategies in graph neural networks (GNNs) with respect to performance and computational efficiency. By introducing a generalization error analysis framework based on Wasserstein distance, the work systematically evaluates how batch size and neighbor sampling (fan-out) jointly influence convergence behavior, generalization capability, and computational overhead. Both theoretical analysis and empirical results demonstrate that full-graph training is not universally superior; under resource-constrained settings, carefully tuned mini-batch training can achieve a more favorable balance between model performance and efficiency. These findings offer principled guidance for hyperparameter selection in practical GNN applications.
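The generalization analysis in the paper is theoretical, but the quantity it builds on, the Wasserstein distance between distributions, is easy to probe empirically. The sketch below is only a minimal illustration, not the paper's framework: it uses SciPy's 1-D Wasserstein distance on hypothetical per-node training and validation losses as a rough proxy for a generalization gap; all numbers are placeholders.

```python
# Minimal illustration of the Wasserstein-1 distance invoked by the paper's
# generalization analysis. NOT the paper's framework: it only measures the
# distance between two hypothetical 1-D loss distributions as a rough
# empirical proxy for a generalization gap.
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)

# Placeholder per-node losses after training (assumed, not from the paper).
train_losses = rng.normal(loc=0.30, scale=0.05, size=1000)
val_losses = rng.normal(loc=0.42, scale=0.08, size=1000)

gap_proxy = wasserstein_distance(train_losses, val_losses)
print(f"Wasserstein-1 distance between loss distributions: {gap_proxy:.4f}")
```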

📝 Abstract
Full-graph and mini-batch Graph Neural Network (GNN) training approaches have distinct system design demands, making it crucial to choose the appropriate approach when developing a training system. A core challenge in comparing the two lies in characterizing their model performance (i.e., convergence and generalization) and computational efficiency. While batch size has been an effective lens for analyzing such behaviors in deep neural networks (DNNs), GNNs extend this lens with a fan-out size, since full-graph training can be viewed as mini-batch training with the largest possible batch size and fan-out size. However, the impact of batch size and fan-out size on GNNs remains insufficiently explored. To this end, this paper systematically compares full-graph and mini-batch GNN training through empirical and theoretical analyses from the viewpoints of batch size and fan-out size. Our key contributions include: 1) a novel generalization analysis using the Wasserstein distance to study the impact of the graph structure, especially the fan-out size; 2) uncovering the non-isotropic effects of batch size and fan-out size on GNN convergence and generalization, which provides practical guidance for tuning these hyperparameters under resource constraints. Finally, we show that full-graph training does not always yield better model performance or computational efficiency than well-tuned mini-batch settings with smaller batch and fan-out sizes. The implementation is available at: https://github.com/LIUMENGFAN-gif/GNN_fullgraph_minibatch_training.
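To make the batch-size and fan-out knobs concrete, here is a minimal mini-batch GNN training sketch, assuming PyTorch Geometric and the Cora dataset. It is not the paper's implementation (that is in the linked repository); the model size, learning rate, batch size, and fan-out values are illustrative. Setting `num_neighbors=[-1, -1]` and `batch_size=data.num_nodes` would correspond to the full-graph limit described in the abstract.

```python
# Illustrative mini-batch GNN training loop (assumed setup, not the paper's
# code) showing where batch size and fan-out enter as hyperparameters.
import torch
import torch.nn.functional as F
from torch_geometric.datasets import Planetoid
from torch_geometric.loader import NeighborLoader
from torch_geometric.nn import SAGEConv

dataset = Planetoid(root="data/Cora", name="Cora")
data = dataset[0]

# Fan-out: number of neighbors sampled per layer (two layers here).
# num_neighbors=[-1, -1] with batch_size=data.num_nodes recovers full-graph training.
loader = NeighborLoader(
    data,
    num_neighbors=[10, 10],      # fan-out per layer (illustrative)
    batch_size=256,              # mini-batch size over seed nodes (illustrative)
    input_nodes=data.train_mask,
    shuffle=True,
)

class SAGE(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.conv1 = SAGEConv(in_dim, hidden_dim)
        self.conv2 = SAGEConv(hidden_dim, out_dim)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)

model = SAGE(dataset.num_features, 64, dataset.num_classes)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

model.train()
for batch in loader:
    optimizer.zero_grad()
    out = model(batch.x, batch.edge_index)
    # Only the first `batch_size` nodes in the sampled subgraph are seed nodes.
    loss = F.cross_entropy(out[:batch.batch_size], batch.y[:batch.batch_size])
    loss.backward()
    optimizer.step()
```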
Problem

Research questions and friction points this paper is trying to address.

Graph Neural Networks
Full-Graph Training
Mini-Batch Training
Batch Size
Fan-Out Size
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph Neural Networks
Full-Graph Training
Mini-Batch Training
Fan-Out Size
Wasserstein Distance