RidgeWalker: Perfectly Pipelined Graph Random Walks on FPGAs

📅 2026-01-16

🤖 AI Summary

Graph random walks (GRWs) are notoriously challenging to accelerate efficiently in hardware due to their strong data dependency, irregular memory access patterns, and execution imbalance. This work presents the first realization of a fully pipelined GRW accelerator, overcoming the limitations of conventional static scheduling through stateless fine-grained task decomposition, an asynchronous pipeline architecture, and a queueing-theory-based feedback-driven dynamic scheduler that enables out-of-order execution and adaptive load balancing. Evaluated on real-world graph datasets, the proposed system achieves an average speedup of 7.0× over existing FPGA implementations and 8.1× over GPU baselines, with peak accelerations reaching 71.0× and 22.9×, respectively.

📝 Abstract

Graph Random Walks (GRWs) offer efficient approximations of key graph properties and have been widely adopted in many applications. However, GRW workloads are notoriously difficult to accelerate due to their strong data dependencies, irregular memory access patterns, and imbalanced execution behavior. While recent work explores FPGA-based accelerators for GRWs, existing solutions fall far short of hardware potential due to inefficient pipelining and static scheduling. This paper presents RidgeWalker, a high-performance GRW accelerator designed for datacenter FPGAs. The key insight behind RidgeWalker is that the Markov property of GRWs allows decomposition into stateless, fine-grained tasks that can be executed out-of-order without compromising correctness. Building on this, RidgeWalker introduces an asynchronous pipeline architecture with a feedback-driven scheduler grounded in queuing theory, enabling perfect pipelining and adaptive load balancing. We prototype RidgeWalker on datacenter FPGAs and evaluate it across a range of GRW algorithms and real-world graph datasets. Experimental results demonstrate that RidgeWalker achieves an average speedup of 7.0x over state-of-the-art FPGA solutions and 8.1x over GPU solutions, with peak speedups of up to 71.0x and 22.9x, respectively. The source code is publicly available at https://github.com/Xtra-Computing/RidgeWalker.
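
The stateless-task idea in the abstract can be illustrated with a small software sketch. This is not RidgeWalker's hardware design or its scheduler. It is a toy model, assuming a first-order uniform random walk on a made-up graph, in which each pending hop is a self-contained task `(walk_id, vertex, hops_left)`. Because the next hop depends only on the current vertex (the Markov property), tasks from different walks can be serviced in any order and every walk still follows valid edges. The names `GRAPH`, `step`, and `run_walks` are hypothetical.

```python
import random

# Hypothetical toy graph as adjacency lists (not taken from the paper).
# Every vertex has at least one out-neighbor, so no walk hits a dead end.
GRAPH = {0: [1, 2], 1: [2], 2: [0, 3], 3: [0]}


def step(task, rng):
    """Advance one walk by a single hop.

    A task carries everything needed for the hop: (walk_id, vertex, hops_left).
    No per-walk history is consulted, which is what makes the task stateless.
    """
    walk_id, vertex, hops_left = task
    next_vertex = rng.choice(GRAPH[vertex])
    return (walk_id, next_vertex, hops_left - 1)


def run_walks(starts, length, seed=0):
    """Run len(starts) walks of `length` hops, servicing tasks out of order."""
    rng = random.Random(seed)
    pending = [(i, v, length) for i, v in enumerate(starts)]
    paths = {i: [v] for i, v in enumerate(starts)}
    while pending:
        # Pick an arbitrary pending task to mimic out-of-order execution.
        idx = rng.randrange(len(pending))
        pending[idx], pending[-1] = pending[-1], pending[idx]
        walk_id, vertex, hops_left = step(pending.pop(), rng)
        paths[walk_id].append(vertex)
        if hops_left > 0:
            # Re-enqueue the walk's next hop as a fresh, independent task.
            pending.append((walk_id, vertex, hops_left))
    return paths


if __name__ == "__main__":
    for walk_id, path in sorted(run_walks([0, 1, 2], length=5).items()):
        print(walk_id, path)
```

Each walk's own hops still happen in sequence, since a walk has at most one pending task at a time. Only the interleaving across walks is arbitrary, which is the freedom a dynamic scheduler could exploit.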

Problem

Research questions and friction points this paper is trying to address.

Graph Random Walks

FPGA acceleration

data dependencies

irregular memory access

pipeline inefficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph Random Walks

FPGA acceleration

asynchronous pipeline