🤖 AI Summary
This work addresses the inefficiency of graph neural network (GNN) inference on streaming graphs, where runtime embedding computation suffers from frequent and costly multi-hop traversals. To overcome this, the authors propose an efficient incremental computation framework that decouples GNN message passing into fine-grained, generic operators and reorders their execution to update embeddings only within affected subgraphs, thereby eliminating redundant computations. This approach is the first to support general-purpose incremental embedding updates under complex message-passing patterns while preserving model accuracy and substantially reducing computational overhead. Furthermore, it integrates GPU-CPU cooperative memory management and communication scheduling to handle large-scale historical embeddings. Experiments across diverse graph sizes and GNN architectures demonstrate a 64%–99% reduction in computation volume and speedups ranging from 1.7× to 145.8× over existing methods.
📝 Abstract
Graph Neural Networks (GNNs) on streaming graphs have gained increasing popularity. However, their practical deployment remains challenging, as the inference process relies on Runtime Embedding Computation (RTEC) to capture recent graph changes. This process incurs heavyweight multi-hop graph traversal overhead, which significantly undermines computation efficiency. We observe that the intermediate results for large portions of the graph remain unchanged during graph evolution, and thus redundant computations can be effectively eliminated through carefully designed incremental methods. In this work, we propose an efficient framework for incrementalizing RTEC on streaming graphs. The key idea is to decouple GNN computation into a set of generalized, fine-grained operators and safely reorder them, transforming the expensive full-neighbor GNN computation into a more efficient form over the affected subgraph. With this design, our framework preserves the semantics and accuracy of the original full-neighbor computation while supporting a wide range of GNN models with complex message-passing patterns. To further scale to graphs with massive historical results, we develop a GPU-CPU co-processing system that offloads embeddings to CPU memory with communication-optimized scheduling. Experiments across diverse graph sizes and GNN models show that our method reduces computation by 64%–99% and achieves 1.7×–145.8× speedups over existing solutions.
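The core idea of the abstract can be illustrated with a minimal sketch: when an edge arrives, only the embeddings of nodes in the affected subgraph need recomputation, while cached results elsewhere stay valid. The sketch below assumes a toy one-layer GNN with mean aggregation and scalar features; the class name `IncrementalGNN` and all method names are hypothetical illustrations, not the paper's actual framework or API.

```python
# Illustrative sketch of incremental runtime embedding computation (RTEC).
# Assumes a hypothetical one-layer GNN with mean aggregation over neighbors;
# for a 1-layer model, an edge insertion only affects its two endpoints.
from collections import defaultdict


class IncrementalGNN:
    def __init__(self, features):
        self.x = dict(features)       # node -> scalar input feature
        self.adj = defaultdict(set)   # undirected adjacency lists
        self.emb = {}                 # cached historical embeddings

    def _compute(self, v):
        # Toy combine step: average of self feature and neighbor mean.
        nbrs = self.adj[v]
        agg = sum(self.x[u] for u in nbrs) / len(nbrs) if nbrs else 0.0
        return 0.5 * self.x[v] + 0.5 * agg

    def full_pass(self):
        # Expensive baseline: recompute every node's embedding.
        for v in self.x:
            self.emb[v] = self._compute(v)

    def insert_edge(self, u, v):
        # Incremental update: refresh only the affected subgraph,
        # which for a 1-hop model is just the two endpoints.
        self.adj[u].add(v)
        self.adj[v].add(u)
        for w in (u, v):
            self.emb[w] = self._compute(w)
```

For deeper models the affected subgraph grows to the k-hop neighborhood of the inserted edge, but the principle is the same: incremental updates touch only that region, while a full pass touches every node.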