GiGL: Large-Scale Graph Neural Networks at Snapchat

📅 2025-02-20
🤖 AI Summary
To address the scale challenges that have slowed industrial adoption of graph neural networks (GNNs), this paper presents GiGL (Gigantic Graph Learning), an open-source library for large-scale distributed graph ML, and describes how Snapchat trains, runs inference with, and uses GNNs. GiGL handles graph data preprocessing from relational databases, subgraph sampling, distributed training, inference, and orchestration, and it interfaces with open-source modeling libraries such as PyTorch Geometric (PyG) so that practitioners can focus on modeling. At Snapchat, GiGL is used in multiple production settings and has powered over 35 launches in the last two years across friend recommendation, content recommendation, and advertising. The paper also covers the library's high-level design and tools, its scaling properties, case studies on industry-scale graphs, and lessons learned.

📝 Abstract
Recent advances in graph machine learning (ML) with the introduction of Graph Neural Networks (GNNs) have led to widespread interest in applying these approaches to business applications at scale. GNNs enable differentiable end-to-end (E2E) learning of model parameters given graph structure, which allows optimization towards popular node, edge (link), and graph-level tasks. While the research innovation in new GNN layers and training strategies has been rapid, industrial adoption and utility of GNNs have lagged considerably due to the unique scale challenges that large-scale graph ML problems create. In this work, we share our approach to training, inference, and utilization of GNNs at Snapchat. To this end, we present GiGL (Gigantic Graph Learning), an open-source library to enable large-scale distributed graph ML to the benefit of researchers, ML engineers, and practitioners. We use GiGL internally at Snapchat to manage the heavy lifting of GNN workflows, including graph data preprocessing from relational DBs, subgraph sampling, distributed training, inference, and orchestration. GiGL is designed to interface cleanly with open-source GNN modeling libraries prominent in academia like PyTorch Geometric (PyG), while handling scaling and productionization challenges so that internal practitioners can focus on modeling. GiGL is used in multiple production settings, and has powered over 35 launches across multiple business domains in the last 2 years in the contexts of friend recommendation, content recommendation, and advertising. This work details the high-level design and tools the library provides, scaling properties, case studies in diverse business settings with industry-scale graphs, and several key lessons learned in employing graph ML at scale on large social data. GiGL is open-sourced at https://github.com/snap-research/GiGL.
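
The subgraph sampling step named in the abstract can be illustrated with a small sketch. This is not GiGL's API or implementation; it is a plain-Python illustration of fanout-limited k-hop neighbor sampling, a common way to build a bounded computation subgraph around a target node. The function names (`build_adjacency`, `sample_khop_subgraph`), the parameters (`num_hops`, `fanout`), and the toy edge rows are all hypothetical.

```python
import random
from collections import defaultdict


def build_adjacency(edge_rows):
    """Turn (src, dst) rows, e.g. read from an edge table, into an undirected adjacency map."""
    adj = defaultdict(set)
    for src, dst in edge_rows:
        adj[src].add(dst)
        adj[dst].add(src)
    return adj


def sample_khop_subgraph(adj, root, num_hops, fanout, rng):
    """Sample at most `fanout` neighbors per node for `num_hops` hops around `root`.

    Returns (nodes, edges); each edge is (neighbor, node), i.e. the direction
    in which a message would flow towards the root during GNN aggregation.
    """
    nodes = {root}
    edges = set()
    frontier = [root]
    for _ in range(num_hops):
        next_frontier = []
        for node in frontier:
            # Sort so that results are reproducible for a given seed.
            neighbors = sorted(adj.get(node, ()))
            if len(neighbors) > fanout:
                neighbors = rng.sample(neighbors, fanout)
            for nbr in neighbors:
                edges.add((nbr, node))
                if nbr not in nodes:
                    nodes.add(nbr)
                    next_frontier.append(nbr)
        frontier = next_frontier
    return nodes, edges


# Hypothetical toy graph: node 6 is three hops from node 0.
edge_rows = [(0, 1), (0, 2), (0, 3), (1, 4), (2, 5), (5, 6)]
adj = build_adjacency(edge_rows)
nodes, edges = sample_khop_subgraph(adj, root=0, num_hops=2, fanout=2, rng=random.Random(0))
```

Capping the fanout bounds the subgraph at 1 + fanout + fanout^2 + ... nodes regardless of how high a node's degree is, which is one reason sampling of this kind is attractive on graphs with heavily connected nodes. In a PyG-based workflow, the sampled nodes and edges would then be converted to tensors for the model; that conversion is omitted here.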
Problem

Research questions and friction points this paper is trying to address.

Scalable Graph Neural Networks for industry applications
Enabling large-scale distributed graph machine learning
Addressing scaling and productionization challenges in GNNs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale distributed graph ML
GNN workflow management
Integration with PyTorch Geometric
Authors

Tong Zhao, Snap Inc., USA
Yozen Liu, Snap Inc.
Matthew Kolodner, Snap Inc., USA
Kyle Montemayor, Snap Inc., USA
Elham Ghazizadeh, Snap Inc., USA
Ankit Batra, Snap Inc., USA
Zihao Fan, Snap Inc., USA
Xiaobin Gao, Snap Inc., USA
Xuan Guo, Snap Inc., USA
Jiwen Ren, Snap Inc., USA
Serim Park, Snap Inc., USA
Peicheng Yu, Snap Inc., USA
Jun Yu, Snap Inc., USA
Shubham Vij, Snap Inc., USA
Neil Shah, Snap Inc., USA