🤖 AI Summary
Graph computation suffers from performance bottlenecks on conventional parallel architectures due to the irregular memory access patterns and severe load imbalance inherent in real-world graphs. To address these challenges, this paper proposes UpDown, a fine-grained programmable architecture co-designed across hardware and software to optimize graph traversal and iterative computation. UpDown supports efficient execution of multiple variants of key graph algorithms, including PageRank and BFS, while preserving high programmability. Evaluated on RMAT-generated graphs, with projections to 33 million processing lanes, UpDown achieves 637K GTEPS for PageRank and 989K GTEPS for BFS, representing 5× and 100× speedups over prior state-of-the-art results, respectively. By fundamentally rethinking architectural support for irregular workloads, UpDown establishes a new paradigm for scalable, high-performance graph processing.
📝 Abstract
Large-scale graph problems are of critical and growing importance, yet parallel architectures have historically provided little support for them. In the spirit of co-design, we explore the question: how fast can graph computing go on a fine-grained architecture? We explore the possibilities of an architecture optimized for fine-grained parallelism, natural programming, and the irregularity and skew found in real-world graphs. Using two graph benchmarks, PageRank (PR) and Breadth-First Search (BFS), we evaluate a fine-grained graph architecture, UpDown, to explore what performance co-design can achieve. To demonstrate programmability, we wrote five variants of these algorithms. Simulations of up to 256 nodes (524,288 lanes) and projections to 16,384 nodes (33M lanes) show the UpDown system can achieve 637K GTEPS on PR and 989K GTEPS on BFS for RMAT graphs, exceeding the best prior results by 5× and 100×, respectively.