🤖 AI Summary
Mini-batch GNN training on large-scale graphs suffers from neighbor explosion, which inflates both memory use and computation. This paper proposes topological compensation (TOP), a mini-batch method that obtains the outputs of the whole message passing using only message passing within the batch (MP-IB). Its core idea is *message invariance*: message-invariant transformations convert the costly message passing from out-of-batch nodes (MP-OB) into MP-IB, so the modified MP-IB has the same output as the whole message passing without loading out-of-batch neighbors. TOP integrates GNN layer decoupling with GPU-efficient subgraph-local computation. Evaluated on graphs with millions of nodes and billions of edges, TOP achieves over 10× training speedup with <0.5% accuracy degradation.
📝 Abstract
Message passing-based graph neural networks (GNNs) have achieved great success in many real-world applications. For a sampled mini-batch of target nodes, the message passing process is divided into two parts: message passing between nodes within the batch (MP-IB) and message passing from nodes outside the batch to those within it (MP-OB). However, MP-OB recursively relies on higher-order out-of-batch neighbors, so its computational cost grows exponentially with the number of layers. Because of this neighbor explosion, the whole message passing must store most nodes and edges on the GPU, which makes many GNNs infeasible on large-scale graphs. To address this challenge, we propose an accurate and fast mini-batch approach for large-graph transductive learning, namely topological compensation (TOP), which obtains the outputs of the whole message passing solely through MP-IB, without the costly MP-OB. The major pillar of TOP is a novel concept of message invariance, which defines message-invariant transformations to convert costly MP-OB into fast MP-IB. This ensures that the modified MP-IB has the same output as the whole message passing. Experiments demonstrate that TOP is faster than existing mini-batch methods by an order of magnitude on very large graphs (millions of nodes and billions of edges) with limited accuracy degradation.
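The split into MP-IB and MP-OB, and the idea of folding MP-OB into MP-IB, can be illustrated on a toy graph. The sketch below is not the paper's algorithm. It uses a single sum-aggregation layer, and it assumes that out-of-batch features are an exact linear function of in-batch features through a hypothetical matrix `R`. Under that assumption, the out-of-batch messages can be absorbed into a "compensated" in-batch adjacency matrix. All names (`R`, `A_comp`, the helper functions) are illustrative choices, not identifiers from the source.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def submatrix(m, rows, cols):
    """Select the given rows and columns of a matrix."""
    return [[m[r][c] for c in cols] for r in rows]


# Toy undirected graph on 5 nodes; A[i][j] = 1 if nodes i and j are connected.
A = [
    [0, 1, 0, 1, 0],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 1],
    [1, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
]
batch = [0, 1, 2]    # in-batch (target) nodes
outside = [3, 4]     # out-of-batch nodes

# In-batch node features (2 features per node).
X_in = [[1, 2], [0, 1], [3, -1]]

# ASSUMPTION (illustrative only): out-of-batch features are a linear
# function of in-batch features, X_out = R @ X_in.
R = [[1, 0, 1], [0, 2, -1]]
X_out = matmul(R, X_in)
X = X_in + X_out     # features of all 5 nodes, in node order

# Whole message passing for the batch nodes: needs every neighbor's features.
full = matmul(submatrix(A, batch, range(5)), X)

# The same output split into its two parts.
A_bb = submatrix(A, batch, batch)     # edges inside the batch
A_bo = submatrix(A, batch, outside)   # edges from outside into the batch
mp_ib = matmul(A_bb, X_in)
mp_ob = matmul(A_bo, X_out)
assert add(mp_ib, mp_ob) == full

# Fold MP-OB into MP-IB: A_bo @ X_out = (A_bo @ R) @ X_in, so a compensated
# in-batch adjacency reproduces the full output from in-batch features alone.
A_comp = add(A_bb, matmul(A_bo, R))
compensated = matmul(A_comp, X_in)
assert compensated == full

print(full)  # [[4, 2], [1, 4], [1, 5]]
```

In this toy case the compensated computation touches only the three in-batch nodes, yet matches the output that otherwise requires the features of all five. When the linear relation holds only approximately, the match would be approximate as well, which is one plausible reading of the "limited accuracy degradation" reported above.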