How Fast Can Graph Computations Go on Fine-grained Parallel Architectures

📅 2025-07-01
🤖 AI Summary
Graph computation suffers from performance bottlenecks on conventional parallel architectures because real-world graphs produce irregular memory access patterns and severe load imbalance. To address these challenges, this paper evaluates UpDown, a fine-grained programmable architecture co-designed across hardware and software for graph computation. To demonstrate programmability, the authors wrote five variants of two benchmark algorithms, PageRank and BFS. Based on simulations of up to 256 nodes (524,288 lanes) and projections to 16,384 nodes (33M lanes), UpDown reaches 637K GTEPS for PageRank and 989K GTEPS for BFS on RMAT graphs, exceeding the best prior results by 5× and 100×, respectively.

📝 Abstract
Large-scale graph problems are of critical and growing importance and historically parallel architectures have provided little support. In the spirit of co-design, we explore the question, How fast can graph computing go on a fine-grained architecture? We explore the possibilities of an architecture optimized for fine-grained parallelism, natural programming, and the irregularity and skew found in real-world graphs. Using two graph benchmarks, PageRank (PR) and Breadth-First Search (BFS), we evaluate a Fine-Grained Graph architecture, UpDown, to explore what performance codesign can achieve. To demonstrate programmability, we wrote five variants of these algorithms. Simulations of up to 256 nodes (524,288 lanes) and projections to 16,384 nodes (33M lanes) show the UpDown system can achieve 637K GTEPS PR and 989K GTEPS BFS on RMAT, exceeding the best prior results by 5x and 100x respectively.
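
The scale figures in the abstract can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes every node has the same number of lanes (the abstract does not state this explicitly), and the helper `gteps_per_lane` is a hypothetical name introduced here, not something from the paper. GTEPS denotes billions of traversed edges per second.

```python
# Figures quoted in the abstract.
SIMULATED_NODES = 256
SIMULATED_LANES = 524_288
PROJECTED_NODES = 16_384

# Assumption: lanes are spread evenly across nodes.
lanes_per_node = SIMULATED_LANES // SIMULATED_NODES
projected_lanes = PROJECTED_NODES * lanes_per_node

# Projected throughput, in GTEPS ("637K GTEPS" = 637e3 GTEPS).
PR_GTEPS = 637e3
BFS_GTEPS = 989e3


def gteps_per_lane(total_gteps: float, lanes: int) -> float:
    """Average throughput per lane (hypothetical helper, for illustration)."""
    return total_gteps / lanes


print(lanes_per_node)   # 2048
print(projected_lanes)  # 33554432, i.e. the "33M lanes" in the abstract
print(gteps_per_lane(PR_GTEPS, projected_lanes))
print(gteps_per_lane(BFS_GTEPS, projected_lanes))
```

Under the even-distribution assumption, the projected configuration works out to 2,048 lanes per node, which matches the rounded "33M lanes" figure.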
Problem

Research questions and friction points this paper is trying to address.

Optimizing graph computations for fine-grained parallel architectures
Enhancing performance of graph algorithms like PageRank and BFS
Exploring scalability of UpDown architecture on large-scale graphs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-grained parallel architecture for graph computing
UpDown system optimized for irregular graph processing
Exceeds the best prior results by 5x on PageRank and 100x on BFS
Authors

Yuqing Wang
Department of Computer Science, University of Chicago
Charles Colley
Department of Computer Science, Purdue University
Brian Wheatman
Department of Computer Science, University of Chicago
Jiya Su
Department of Computer Science, University of Chicago
David F. Gleich
Department of Computer Science, Purdue University
Andrew A. Chien
William Eckhardt Distinguished Service Professor of Computer Science, University of Chicago
computer architecture, high-performance computing, clouds, sustainability, data-intensive