Streaming Graph Algorithms in the Massively Parallel Computation Model

📅 2024-06-17

🏛️ ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper studies the processing of dynamically changing graphs in the Massively Parallel Computation (MPC) model. It presents algorithms that handle large batches of edge insertions and deletions in a constant number of parallel rounds, using local memory per machine that is strongly sublinear in the number of vertices and total memory that is sublinear in the graph size. Within this most restrictive memory regime, the algorithms cover connectivity, minimum spanning forest, and approximate matching. Earlier work in this area relies on total memory proportional to the size of the processed graph; the results here show that batch updates can be processed with asymptotically optimal use of parallel time, local memory, and total memory.

📝 Abstract

We initiate the study of graph algorithms in the streaming setting on massive distributed and parallel systems inspired by practical data processing systems. The objective is to design algorithms that can efficiently process evolving graphs via large batches of edge insertions and deletions using as little memory as possible. We focus on the Massively Parallel Computation (MPC) model, which is now the canonical model for the theoretical study of algorithms for massive networks. We design MPC algorithms that efficiently process evolving graphs: in a constant number of rounds they can handle large batches of edge updates for problems such as connectivity, minimum spanning forest, and approximate matching while adhering to the most restrictive memory regime, in which the local memory per machine is strongly sublinear in the number of vertices and the total memory is sublinear in the graph size. These results improve upon earlier works in this area, which rely on larger total space, proportional to the size of the processed graph. Our work demonstrates that parallel algorithms can process dynamically changing graphs with asymptotically optimal utilization of MPC resources (parallel time, local memory, and total memory) while processing large batches of edge updates.
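To make the memory claim concrete, here is a minimal sketch, not the paper's algorithm, of answering connectivity queries under batches of edge insertions while storing only one word per vertex rather than the edges themselves. It is sequential, ignores deletions and the distribution of state across machines, and uses a standard union-find structure. The class name `BatchConnectivity` and its methods are hypothetical names chosen for illustration.

```python
from typing import Iterable, List, Tuple

Edge = Tuple[int, int]


class BatchConnectivity:
    """Toy insert-only connectivity structure (illustrative assumption,
    not the paper's method). State is one parent pointer per vertex,
    so memory is O(n) regardless of how many edges are inserted."""

    def __init__(self, n: int) -> None:
        self.parent: List[int] = list(range(n))
        self.components = n

    def find(self, v: int) -> int:
        # Follow parent pointers to the root, halving the path as we go.
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]
            v = self.parent[v]
        return v

    def insert_batch(self, edges: Iterable[Edge]) -> None:
        # Process a whole batch of insertions; edges are not stored.
        for u, v in edges:
            ru, rv = self.find(u), self.find(v)
            if ru != rv:
                self.parent[ru] = rv
                self.components -= 1

    def connected(self, u: int, v: int) -> bool:
        return self.find(u) == self.find(v)


cc = BatchConnectivity(6)
cc.insert_batch([(0, 1), (1, 2), (0, 2)])  # first batch
cc.insert_batch([(3, 4)])                  # second batch
print(cc.connected(0, 2), cc.connected(2, 3), cc.components)
```

Edge deletions are what make the problem hard: once edges are discarded, a deleted tree edge cannot be replaced by simply rescanning the graph, and the sketch above makes no attempt to handle that case or to run in a constant number of parallel rounds.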

Problem

Research questions and friction points this paper is trying to address.

MPC Model

Graph Algorithms

Memory Efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

MPC model

Efficient graph processing

Low memory usage

🔎 Similar Papers

CATGNN: Cost-Efficient and Scalable Distributed Training for Graph Neural Networks

2024-04-02 | arXiv.org | Citations: 1
