Incremental GNN Embedding Computation on Streaming Graphs

📅 2026-03-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the inefficiency of graph neural network (GNN) inference on streaming graphs, where runtime embedding computation suffers from frequent and costly multi-hop traversals. To overcome this, the authors propose an efficient incremental computation framework that decouples GNN message passing into fine-grained, generic operators and reorders their execution to update embeddings only within affected subgraphs, thereby eliminating redundant computations. This approach is the first to support general-purpose incremental embedding updates under complex message-passing patterns while preserving model accuracy and substantially reducing computational overhead. Furthermore, it integrates GPU-CPU cooperative memory management and communication scheduling to handle large-scale historical embeddings. Experiments across diverse graph sizes and GNN architectures demonstrate a 64%–99% reduction in computation volume and speedups ranging from 1.7× to 145.8× over existing methods.

📝 Abstract
Graph Neural Network (GNN) inference on streaming graphs has gained increasing popularity. However, its practical deployment remains challenging, as the inference process relies on Runtime Embedding Computation (RTEC) to capture recent graph changes. This process incurs heavyweight multi-hop graph traversal overhead, which significantly undermines computation efficiency. We observe that the intermediate results for large portions of the graph remain unchanged during graph evolution, and thus redundant computations can be effectively eliminated through carefully designed incremental methods. In this work, we propose an efficient framework for incrementalizing RTEC on streaming graphs. The key idea is to decouple GNN computation into a set of generalized, fine-grained operators and safely reorder them, transforming the expensive full-neighbor GNN computation into a more efficient form over the affected subgraph. With this design, our framework preserves the semantics and accuracy of the original full-neighbor computation while supporting a wide range of GNN models with complex message-passing patterns. To further scale to graphs with massive historical results, we develop a GPU-CPU co-processing system that offloads embeddings to CPU memory with communication-optimized scheduling. Experiments across diverse graph sizes and GNN models show that our method reduces computation by 64%-99% and achieves 1.7x-145.8x speedups over existing solutions.
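The core idea, recomputing embeddings only within the subgraph affected by a graph update, can be illustrated with a minimal sketch. This is not the paper's actual operator decomposition or scheduling system; the function names, the single-layer mean aggregator, and the hop-limited BFS are simplifying assumptions chosen to show why caching unchanged results avoids redundant work.

```python
import numpy as np

def affected_nodes(adj, changed, hops):
    """BFS out to `hops` hops from the endpoints of changed edges.
    Only these nodes can have stale cached embeddings."""
    frontier, seen = set(changed), set(changed)
    for _ in range(hops):
        nxt = {w for u in frontier for w in adj.get(u, ())}
        frontier = nxt - seen
        seen |= nxt
    return seen

def incremental_update(adj, feats, cache, changed, hops=1):
    """Recompute one-layer mean-aggregated embeddings only for nodes
    affected by `changed`; cached results elsewhere are reused as-is."""
    for v in affected_nodes(adj, changed, hops):
        nbrs = adj.get(v, [])
        if nbrs:
            cache[v] = np.mean([feats[u] for u in nbrs], axis=0)
    return cache
```

For example, after inserting an edge between nodes 0 and 3, passing `changed={0, 3}` touches only the neighborhood of that edge; embeddings of distant nodes keep their cached values, which is the source of the 64%-99% computation reduction the paper reports for its (far more general) operator-level scheme.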
Problem

Research questions and friction points this paper is trying to address.

Graph Neural Network
Streaming Graphs
Runtime Embedding Computation
Incremental Computation
Multi-hop Graph Traversal
Innovation

Methods, ideas, or system contributions that make the work stand out.

Incremental GNN
Streaming Graphs
Runtime Embedding Computation
GPU-CPU Co-processing
Fine-grained Operator Reordering
Qiange Wang
National University of Singapore, Singapore
Haoran Lv
School of Computer Science and Engineering, Northeastern University, Shenyang, China
Yanfeng Zhang
Northeastern University, China
Database Systems, Machine Learning Systems
Weng-Fai Wong
Associate Professor of Computer Science, National University of Singapore
Computer architecture, compilers, high performance computing, embedded systems, parallel processing
Bingsheng He
National University of Singapore, Singapore