Estimating Random-Walk Probabilities in Directed Graphs

📅 2025-04-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper studies how efficiently one can estimate the probability π(s,t) that a discounted random walk starting at source s terminates at target t in a directed graph. The goal is to estimate π(s,t) within a constant relative error, with constant probability, unless π(s,t) < δ for a given threshold δ. The paper shows that the known upper bounds are tight across all parameters: the right bound is Θ̃(min{m, 1/δ}) in the worst case and Θ̃(min{m, √(d/δ), 1/δ}) in the average case over all targets t, where m is the number of edges and d the average out-degree. It also considers two additional queries from the literature: a constant-time test for whether an edge from u to v exists, and access to the adjacency list of u sorted by out-degree. Neither query helps in the worst case, but with both available the average-case bound becomes Θ̃(min{m, √(d/δ), (1/δ)^{2/3}}).

📝 Abstract

We study discounted random walks in a directed graph. In each vertex, the walk will either terminate with some probability $\alpha$, or continue to a random out-neighbor. We are interested in the probability $\pi(s,t)$ that such a random walk starting in $s$ ends in $t$. We wish to, with constant probability, estimate $\pi(s, t)$ within a constant relative error, unless $\pi(s, t)<\delta$ for some given threshold $\delta$. The current status is as follows. Algorithms with worst-case running time $\tilde O(m)$ and $O(1/\delta)$ are known. A more complicated algorithm is known, which does not perform better in the worst case, but for the average running time over all $n$ possible targets $t$, it achieves an alternative bound of $O(\sqrt{d/\delta})$. All the above algorithms assume query access to the adjacency list of a node. On the lower bound side, the best-known lower bound for the worst case is $\Omega(n^{1/2}m^{1/4})$ with $\delta \leq 1/(n^{1/2}m^{1/4})$, and for the average case it is $\Omega(\sqrt{n})$ with $\delta \leq 1/n$. This leaves substantial polynomial gaps in both cases. In this paper, we show that the above upper bounds are tight across all parameters $n$, $m$ and $\delta$. We show that the right bound is $\tilde\Theta(\min\{m, 1/\delta\})$ for the worst case, and $\tilde\Theta(\min\{m, \sqrt{d/\delta}, 1/\delta\})$ for the average case. We also consider some additional graph queries from the literature. One allows checking whether there is an edge from $u$ to $v$ in constant time. Another allows access to the adjacency list of $u$ sorted by out-degree. We prove that none of these access queries help in the worst case, but if we have both of them, we get an average-case bound of $\tilde\Theta(\min\{m,\sqrt{d/\delta}, (1/\delta)^{2/3}\})$.
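To make the quantity in the abstract concrete, here is a minimal Python sketch that estimates $\pi(s,t)$ by plain Monte Carlo simulation of discounted walks. It is an illustration only and is not claimed to be any of the algorithms the abstract refers to. The names `simulate_walk` and `estimate_pi`, the dictionary-based graph representation, and the rule that a walk stopping at a vertex with no out-neighbors terminates there are all assumptions made for this example.

```python
import random
from typing import Dict, Hashable, List

# Hypothetical representation: each vertex maps to its adjacency list.
Graph = Dict[Hashable, List[Hashable]]


def simulate_walk(graph: Graph, start: Hashable, alpha: float,
                  rng: random.Random) -> Hashable:
    """Run one discounted walk from `start` and return the vertex where it ends."""
    v = start
    while True:
        # In each vertex, terminate with probability alpha.
        if rng.random() < alpha:
            return v
        out_neighbors = graph.get(v, [])
        # Assumption (not from the abstract): a walk stuck at a vertex
        # with no out-neighbors terminates there.
        if not out_neighbors:
            return v
        # Otherwise continue to a uniformly random out-neighbor.
        v = rng.choice(out_neighbors)


def estimate_pi(graph: Graph, s: Hashable, t: Hashable, alpha: float,
                num_walks: int, seed: int = 0) -> float:
    """Estimate pi(s, t) as the fraction of walks from s that end in t."""
    rng = random.Random(seed)
    hits = sum(simulate_walk(graph, s, alpha, rng) == t
               for _ in range(num_walks))
    return hits / num_walks


if __name__ == "__main__":
    # Directed 2-cycle: a -> b -> a. With alpha = 0.5 the exact values
    # are pi(a, a) = 2/3 and pi(a, b) = 1/3.
    cycle = {"a": ["b"], "b": ["a"]}
    print(estimate_pi(cycle, "a", "a", alpha=0.5, num_walks=20000))
    print(estimate_pi(cycle, "a", "b", alpha=0.5, num_walks=20000))
```

For the 2-cycle, the exact value follows from a geometric series: the walk ends at `a` after an even number of steps, so $\pi(a,a) = \alpha / (1 - (1-\alpha)^2) = 2/3$ for $\alpha = 0.5$. The number of walks such an estimator needs grows as the target probability shrinks, which is why the threshold $\delta$ appears in the running-time bounds.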

Problem

Research questions and friction points this paper is trying to address.

Estimating random-walk probabilities in directed graphs

Determining tight bounds for worst-case and average-case running times

Evaluating impact of additional graph queries on performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

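Proves matching lower bounds showing that the known upper bounds are tight across all parameters n, m, and δ

Establishes Θ̃(min{m, 1/δ}) as the right bound for the worst case and Θ̃(min{m, √(d/δ), 1/δ}) for the average case

Shows that edge-existence and sorted-adjacency queries do not help in the worst case, but together yield an average-case bound of Θ̃(min{m, √(d/δ), (1/δ)^{2/3}})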

🔎 Similar Papers

Maximum Likelihood Estimation on Stochastic Blockmodels for Directed Graph Clustering