🤖 AI Summary
To address the high computational cost and poor scalability of Random Walk Centrality (RWC) computation on large-scale networks, this paper formulates RWC estimation as a scalable numerical optimization problem, which it presents as the first such formulation, and proposes two theoretically grounded, near-linear-time algorithms with provable approximation guarantees. Methodologically, it combines graph-theoretic insights with numerical linear algebra: (1) it accelerates Laplacian system solving via approximate Cholesky factorization and sparse inverse estimation; and (2) it uses Monte Carlo sampling of rooted spanning trees to estimate expected hitting times efficiently. Evaluated on large real-world networks, including one with over ten million nodes, the approach achieves a 10–100× speedup over state-of-the-art methods while maintaining high accuracy (relative error below 5%). This substantially improves the practicality of RWC computation on massive graphs.
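The summary does not say which sampler the paper uses for rooted spanning trees. As a rough illustration of what sampling such a tree can look like, the sketch below uses Wilson's algorithm (loop-erased random walks), a standard way to draw a uniform spanning tree of a connected undirected graph. The function name `sample_rooted_spanning_tree`, the adjacency-dict input, and the toy 4-cycle are assumptions made for this example; how the paper turns sampled trees into hitting-time estimates is not shown here.

```python
import random

def sample_rooted_spanning_tree(adj, root, rng):
    """Wilson's algorithm on a connected undirected graph (hypothetical helper).

    adj maps each node to a list of neighbours. Returns a dict `parent`
    where parent[v] is the next node on the tree path from v to `root`.
    """
    in_tree = {root}
    parent = {}
    for start in adj:
        # Random walk from `start` until it hits the current tree.
        # Overwriting parent[v] on every visit erases loops implicitly.
        v = start
        while v not in in_tree:
            parent[v] = rng.choice(adj[v])
            v = parent[v]
        # Retrace the loop-erased path and attach it to the tree.
        v = start
        while v not in in_tree:
            in_tree.add(v)
            v = parent[v]
    return parent

# Toy input (assumption): a 4-cycle, rooted at node 0.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
tree = sample_rooted_spanning_tree(cycle, root=0, rng=random.Random(7))
```

Every non-root node ends up with exactly one parent pointer, so following the pointers from any node leads to the root. A Monte Carlo estimator would repeat the sampling many times and average a tree-derived quantity.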
📝 Abstract
Random walk centrality is a fundamental metric in graph mining for quantifying node importance and influence, defined as the weighted average of hitting times to a node from all other nodes. Despite its ability to capture rich graph structural information and its wide range of applications, computing this measure for large networks remains impractical due to the computational demands of existing methods. In this paper, we present a novel formulation of random walk centrality that underpins two scalable algorithms: one leverages approximate Cholesky factorization and sparse inverse estimation, while the other samples rooted spanning trees. Both algorithms run in near-linear time and provide strong approximation guarantees. Extensive experiments on large real-world networks, including one with over 10 million nodes, demonstrate the efficiency and approximation quality of the proposed algorithms.
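To make the definition concrete, here is a minimal exact baseline for tiny graphs. It is not one of the paper's algorithms. It assumes a simple connected undirected graph and assumes the weights are the stationary probabilities d_i / (2m), which the abstract does not spell out. Hitting times to a target come from the linear system d_i h_i - Σ_{k~i} h_k = d_i for i ≠ target, with h_target = 0. The function names and the toy path graph are illustrative.

```python
from fractions import Fraction

def hitting_times_to(adj, target):
    """Exact expected hitting times from every node to `target`."""
    nodes = [v for v in adj if v != target]
    idx = {v: r for r, v in enumerate(nodes)}
    m = len(nodes)
    # Augmented matrix [Laplacian without target row/column | degrees].
    M = [[Fraction(0)] * (m + 1) for _ in range(m)]
    for v in nodes:
        r = idx[v]
        M[r][r] = Fraction(len(adj[v]))
        M[r][m] = Fraction(len(adj[v]))
        for u in adj[v]:
            if u != target:
                M[r][idx[u]] -= 1
    # Gauss-Jordan elimination in exact rational arithmetic.
    for c in range(m):
        p = next(r for r in range(c, m) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for r in range(m):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    h = {target: Fraction(0)}
    for v in nodes:
        h[v] = M[idx[v]][m]
    return h

def weighted_hitting_time(adj, target):
    """Average of hitting times to `target`, weighted by degree / (2m)."""
    total_degree = sum(len(nbrs) for nbrs in adj.values())
    h = hitting_times_to(adj, target)
    return sum(Fraction(len(adj[v]), total_degree) * h[v] for v in adj)

# Toy input (assumption): the path 0 - 1 - 2.
path = {0: [1], 1: [0, 2], 2: [1]}
scores = {v: weighted_hitting_time(path, v) for v in path}
```

On this path the middle node scores 1/2 and each endpoint 5/2, so under this convention a smaller value marks a node that random walks reach sooner. Solving one dense system per node is what makes the naive approach impractical at scale, which is the gap the abstract's near-linear-time algorithms target.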