AI Summary
Efficiently estimating hitting times between arbitrary node pairs in large-scale graphs remains challenging due to prohibitive global computation costs.
Method: This paper proposes a truncated double random walk algorithm requiring only local neighborhood access. It is the first to adapt spectral truncation techniques to the asymmetric hitting time setting, establishing a theoretical connection between hitting time estimation and distributional discrepancy testing. By integrating Kronecker graph modeling, local sampling, and Chernoff bounds for Markov chains, the method achieves approximation with rigorous error guarantees.
Contribution/Results: Theoretically, we derive tight asymptotic upper and lower bounds and explicitly characterize the relationship between bias and truncation step. Empirically, the algorithm achieves high accuracy and linear scalability on both real-world and synthetic graphs, significantly outperforming global methods. It is broadly applicable to network analysis, recommender systems, and diffusion modeling.
Abstract
Hitting times provide a fundamental measure of distance in random processes, quantifying the expected number of steps for a random walk starting at node $u$ to reach node $v$. They have broad applications across domains such as network centrality analysis, ranking and recommendation systems, and epidemiology. In this work, we develop local algorithms for estimating hitting times between a pair of vertices $u,v$ without accessing the full graph, overcoming scalability issues of prior global methods. Our first algorithm uses the key insight that hitting time computations can be truncated at the meeting time of two independent random walks from $u$ and $v$. This leads to an efficient estimator analyzed via the Kronecker product graph and Markov chain Chernoff bounds. Our second algorithm extends the work of [Peng et al.; KDD 2021] by introducing a novel adaptation of the spectral cutoff technique to account for the asymmetry of hitting times. This adaptation captures the directionality of the underlying random walk and requires non-trivial modifications to ensure accuracy and efficiency. In addition to the algorithmic upper bounds, we provide tight asymptotic lower bounds. Finally, we reveal a connection between hitting time estimation and distribution testing, and validate our algorithms using experiments on both real and synthetic data.
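For orientation, the quantity being estimated can be illustrated with a naive Monte Carlo baseline. The sketch below is not the paper's double random walk or spectral cutoff algorithm. It only simulates walks from $u$ through a local neighbor oracle until they reach $v$, capping each walk at a fixed number of steps. The names `estimate_hitting_time`, `neighbors`, `num_walks`, and `max_steps` are assumptions made for this example and do not come from the paper.

```python
import random
from typing import Callable, Hashable, Optional, Sequence

Node = Hashable


def estimate_hitting_time(
    neighbors: Callable[[Node], Sequence[Node]],
    u: Node,
    v: Node,
    num_walks: int = 1000,
    max_steps: int = 10_000,
    rng: Optional[random.Random] = None,
) -> float:
    """Monte Carlo estimate of the hitting time from u to v.

    Only local access is used: each step queries the neighbors of the
    current node. Walks are capped at max_steps (an assumed parameter),
    so the result estimates E[min(T, max_steps)] rather than E[T].
    """
    if rng is None:
        rng = random.Random()
    total_steps = 0
    for _ in range(num_walks):
        node, steps = u, 0
        # Simple random walk: move to a uniformly random neighbor.
        while node != v and steps < max_steps:
            node = rng.choice(neighbors(node))
            steps += 1
        total_steps += steps
    return total_steps / num_walks


# Hypothetical toy input: the undirected path 0 - 1 - 2 as adjacency lists.
path = {0: [1], 1: [0, 2], 2: [1]}
rng = random.Random(0)
forward = estimate_hitting_time(path.__getitem__, 0, 1, rng=rng)
backward = estimate_hitting_time(path.__getitem__, 1, 0, num_walks=20_000, rng=rng)
print(f"0 -> 1: {forward:.2f}, 1 -> 0: {backward:.2f}")
```

On this three-node path the exact hitting times are 1 from node 0 to node 1 and 3 from node 1 to node 0, a small instance of the asymmetry the abstract refers to. Capping each walk biases the estimate downward, and on large graphs walks may need many steps before reaching $v$, which is the cost that motivates truncating the computation more carefully.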