🤖 AI Summary
To address the high memory overhead and low training efficiency of Graph Neural Networks (GNNs) on large-scale graphs—caused by neighborhood explosion—this paper proposes a single-GPU scalable framework. First, it introduces LPMetis, a lightweight graph partitioning algorithm that mitigates boundary distortion in subgraphs. Second, it incorporates subgraph augmentation and cache-aware message-passing optimization to reduce redundant computation and GPU memory consumption. Third, it establishes a unified training architecture compatible with diverse GNN models. The framework enables end-to-end training on billion-edge-scale graphs; in Tencent’s production environment, it completes training within 10 hours. On user acquisition tasks, it outperforms state-of-the-art models by 8.24%–13.89% in accuracy, achieving a favorable trade-off among prediction precision, resource efficiency, and industrial deployability.
📝 Abstract
Graph Neural Networks (GNNs) have emerged as powerful tools for various graph mining tasks, yet existing scalable solutions often struggle to balance execution efficiency with prediction accuracy. These difficulties stem from iterative message passing, which imposes significant computational demands and extensive GPU memory requirements, particularly under the neighbor-explosion problem inherent in large-scale graphs. This paper introduces LPS-GNN, a scalable, low-cost, flexible, and efficient GNN framework that performs representation learning on graphs at the 100-billion-edge scale with a single GPU in 10 hours and delivers a 13.8% improvement in User Acquisition scenarios. We examine existing graph partitioning methods and design a superior graph partitioning algorithm named LPMetis, which outperforms current state-of-the-art (SOTA) approaches on various evaluation metrics. In addition, we propose a subgraph augmentation strategy to enhance the model's predictive performance. The framework exhibits excellent compatibility, accommodating a variety of GNN algorithms. Successfully deployed on the Tencent platform, LPS-GNN has been evaluated on public and real-world datasets, achieving performance lifts of 8.24% to 13.89% over SOTA models in online applications.
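The partition-then-train paradigm the abstract describes can be illustrated with a minimal sketch: split the node set into parts, then run message passing on one subgraph at a time so only that subgraph needs to be resident on the GPU. This is not the paper's implementation; the naive contiguous partitioner and mean aggregation below are hypothetical stand-ins for LPMetis and the framework's cache-aware message passing.

```python
# Illustrative sketch of partition-based GNN training (hypothetical, not LPS-GNN's code).
import numpy as np

def partition_nodes(num_nodes, num_parts):
    """Naive contiguous partitioning; LPMetis itself is far more sophisticated."""
    return np.array_split(np.arange(num_nodes), num_parts)

def mean_aggregate(features, edges, nodes):
    """One round of mean message passing restricted to a subgraph's induced edges."""
    nodes = set(nodes.tolist())
    out = features.copy()
    for v in nodes:
        # incoming neighbors of v that fall inside the same partition
        neigh = [u for u, w in edges if w == v and u in nodes]
        if neigh:
            out[v] = features[neigh].mean(axis=0)
    return out

# Toy graph: 6 nodes with one-hot features, a few directed edges (src, dst).
features = np.eye(6)
edges = [(0, 1), (1, 2), (3, 4), (4, 5)]

# Process one subgraph at a time, mimicking single-GPU subgraph training.
for part in partition_nodes(6, 2):
    features = mean_aggregate(features, edges, part)
```

Restricting aggregation to intra-partition edges is exactly what introduces the boundary distortion the summary mentions; a partitioner like LPMetis aims to cut as few such edges as possible.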