🤖 AI Summary
To simultaneously achieve low-latency streaming updates, high-throughput batched updates, and low memory overhead for random walk generation on dynamic graphs, this paper introduces Bingo, a GPU-based random walk engine built specifically for dynamically changing graphs. Bingo rests on three key innovations: (1) a radix-based bias factorization algorithm that enables constant-time sampling while supporting fast streaming updates; (2) a group-adaption design that dramatically reduces space consumption; and (3) GPU-aware designs that sustain high-throughput batched graph updates on massively parallel platforms. Evaluated across diverse applications, settings, and real-world dynamic graph datasets, Bingo achieves up to a 271.11× speedup over state-of-the-art random walk engines while keeping memory consumption acceptable.
📝 Abstract
Random walks are a primary means of extracting information from large-scale graphs. While most real-world graphs are inherently dynamic, state-of-the-art random walk engines fail to support this critical use case efficiently. This paper takes the initiative to build a general random walk engine for dynamically changing graphs under two key principles: (i) the system should support both low-latency streaming updates and high-throughput batched updates; (ii) the system should achieve fast sampling speed while maintaining acceptable space consumption to support dynamic graph updates. Upholding both standards, we introduce Bingo, a GPU-based random walk engine for dynamically changing graphs. First, we propose a novel radix-based bias factorization algorithm that achieves constant-time sampling complexity while supporting fast streaming updates. Second, we present a group-adaption design that dramatically reduces space consumption. Third, we incorporate GPU-aware designs to support high-throughput batched graph updates on massively parallel platforms. Together, Bingo outperforms existing efforts across various applications, settings, and datasets, achieving up to a 271.11× speedup over the state of the art.
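To make the radix-based idea concrete, here is a minimal sketch of one common way such a bias factorization can work; the class name and structure are illustrative assumptions, not Bingo's actual implementation. Edge weights are grouped by radix bucket (items with weight in [2^g, 2^(g+1)) share a group), a draw first picks a group proportional to its total weight and then rejection-samples uniformly inside it (acceptance is at least 1/2, so the expected number of tries is constant), and an insert or delete touches only one group, which is what makes streaming updates cheap:

```python
import math
import random

class RadixSampler:
    """Illustrative radix-decomposed weighted sampler (not Bingo's code).

    Items whose weights fall in [2^g, 2^(g+1)) share group g. Sampling picks
    a group proportional to its total weight, then rejection-samples
    uniformly within it; acceptance >= 1/2, so expected tries are constant.
    Insert/remove touches a single group, enabling fast streaming updates.
    """

    def __init__(self):
        self.groups = {}   # g -> list of (item, weight)
        self.totals = {}   # g -> sum of weights in group g
        self.pos = {}      # item -> (g, index within group list)

    def _g(self, w):
        # Radix bucket: floor(log2(w)) puts w into [2^g, 2^(g+1)).
        return math.floor(math.log2(w))

    def add(self, item, w):
        g = self._g(w)
        self.groups.setdefault(g, []).append((item, w))
        self.totals[g] = self.totals.get(g, 0.0) + w
        self.pos[item] = (g, len(self.groups[g]) - 1)

    def remove(self, item):
        # Swap-delete inside the group keeps removal O(1).
        g, i = self.pos.pop(item)
        grp = self.groups[g]
        _, w = grp[i]
        last = grp.pop()
        if i < len(grp):
            grp[i] = last
            self.pos[last[0]] = (g, i)
        self.totals[g] -= w

    def sample(self):
        # Group selection: linear scan over O(log W) radix groups.
        r = random.random() * sum(self.totals.values())
        for g, t in self.totals.items():
            if r < t:
                break
            r -= t
        # Within-group rejection sampling: every weight in group g lies in
        # [2^g, 2^(g+1)), so accepting with probability w / 2^(g+1)
        # succeeds at least half the time.
        grp = self.groups[g]
        while True:
            item, w = random.choice(grp)
            if random.random() * (2 ** (g + 1)) < w:
                return item
```

This sketch selects a group by linear scan, which is already only O(log W) in the maximum weight; a production engine would additionally need the GPU-parallel layout and the space-saving group organization the paper describes.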