🤖 AI Summary
This paper studies the computational complexity of locally estimating the PageRank centrality of a single node in directed graphs, focusing on closing the gap between upper and lower bounds under the assumption of bounded in-degree (Δ_in). The authors propose a novel algorithm based on stochastic reverse propagation and Monte Carlo sampling: starting from the target node, it selectively traces edges backward along the reverse graph and performs weighted sampling, accessing only a local subgraph. Theoretically, its time complexity is O(1/ε² ⋅ log(1/δ) ⋅ (1 + αΔ_in)/(1 − α)²), matching the state-of-the-art lower bound up to logarithmic factors and thereby achieving optimality for small Δ_in for the first time. This result fills a fundamental gap in the in-degree-parameterized complexity theory of PageRank estimation and significantly improves upon prior algorithms whose complexities depend on the total number of nodes or on the out-degree.
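To make the quantity being estimated concrete: π(t) is the probability that a random walk, started at a uniformly chosen node and terminating with probability α at each step, ends at t. The naive Monte Carlo baseline that the paper's local algorithm improves upon simply simulates such walks and counts how many end at t. A minimal sketch (function and parameter names are illustrative, not from the paper):

```python
import random

def estimate_pagerank(graph, t, alpha=0.2, num_walks=100_000, rng=random):
    """Naive Monte Carlo estimate of π(t).

    graph: dict mapping each node to its list of out-neighbors.
    At each step the walk terminates with probability alpha; otherwise
    it moves to a uniformly random out-neighbor. π(t) is the fraction
    of walks (from uniform starts) that terminate at t.

    Simplifying assumption: every node has at least one out-neighbor
    (dangling nodes are not handled in this sketch).
    """
    nodes = list(graph)
    hits = 0
    for _ in range(num_walks):
        u = rng.choice(nodes)                 # uniform starting node
        while rng.random() >= alpha:          # continue with prob. 1 - alpha
            u = rng.choice(graph[u])          # step to a random out-neighbor
        hits += (u == t)                      # walk terminated at u
    return hits / num_walks
```

Note that this baseline touches nodes far from t and needs Ω(1/π(t)) walks to see t at all, which is exactly the kind of global cost the paper's local, reverse-propagation approach avoids.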
📝 Abstract
We study the computational complexity of locally estimating a node's PageRank centrality in a directed graph $G$. For any node $t$, its PageRank centrality $π(t)$ is defined as the probability that a random walk in $G$, starting from a uniformly chosen node, terminates at $t$, where each step terminates with a constant probability $α \in (0,1)$.
To obtain a multiplicative $\big(1 \pm O(1)\big)$-approximation of $π(t)$ with probability $Ω(1)$, the previously best upper bound is $O\big(n^{1/2} \min\{Δ_{in}^{1/2}, Δ_{out}^{1/2}, m^{1/4}\}\big)$ from [Wang, Wei, Wen, Yang STOC '24], where $n$ and $m$ denote the number of nodes and edges in $G$, and $Δ_{in}$ and $Δ_{out}$ upper bound the in-degrees and out-degrees of $G$, respectively. The same paper implicitly gives the previously best lower bound of $Ω\big(n^{1/2} \min\{Δ_{in}^{1/2}/n^γ, Δ_{out}^{1/2}/n^γ, m^{1/4}\}\big)$, where $γ = \frac{\log(1/(1-α))}{4\log Δ_{in} - 2\log(1/(1-α))}$ if $Δ_{in} > 1/(1-α)$, and $γ = 1/2$ if $Δ_{in} \le 1/(1-α)$. As $γ$ only depends on $Δ_{in}$, the known upper bound is tight if we only parameterize the complexity by $n$, $m$, and $Δ_{out}$. However, there remains a gap of $Ω(n^γ)$ when considering $Δ_{in}$, and this gap is large when $Δ_{in}$ is small. In the extreme case where $Δ_{in} \le 1/(1-α)$, we have $γ = 1/2$, leading to a gap of $Ω(n^{1/2})$ between the upper bound $O(n^{1/2})$ and the lower bound $Ω(1)$.
In this paper, we present a new algorithm that achieves the above lower bound (up to logarithmic factors). The algorithm assumes that $n$ and the bounds $Δ_{in}$ and $Δ_{out}$ are known in advance. Our key technique is a novel randomized backward propagation process that propagates only selectively, guided by Monte Carlo estimates of PageRank scores.
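For intuition about what "backward propagation" means here, the classical building block is the deterministic backward push on the reverse graph: maintain a residue $r$ and a reserve $p$, and repeatedly move an $α$-fraction of a node's residue into its reserve while spreading the remaining $(1-α)$-fraction to its in-neighbors. The sketch below is this standard baseline, not the paper's randomized selective variant; names and the threshold parameter `rmax` are illustrative.

```python
from collections import deque

def backward_push(in_nbrs, out_deg, t, alpha, rmax):
    """Deterministic backward push from target t on the reverse graph.

    in_nbrs: dict mapping each node to its list of in-neighbors.
    out_deg: dict mapping each node to its out-degree.
    Pushes any residue exceeding rmax; on termination all residues
    are at most rmax, and (1/n) * sum(p.values()) estimates π(t)
    with additive error at most rmax.
    """
    p, r = {}, {t: 1.0}
    queue = deque([t])
    while queue:
        u = queue.popleft()
        ru = r.pop(u, 0.0)                    # push u's entire residue
        if ru == 0.0:
            continue
        p[u] = p.get(u, 0.0) + alpha * ru     # alpha-fraction becomes reserve
        for v in in_nbrs.get(u, []):
            rv_new = r.get(v, 0.0) + (1 - alpha) * ru / out_deg[v]
            if r.get(v, 0.0) <= rmax < rv_new:  # enqueue on crossing threshold
                queue.append(v)
            r[v] = rv_new
    return p, r
```

The paper's contribution can be read against this baseline: instead of pushing every residue above a fixed threshold, the propagation is randomized and performed only selectively, with cheap Monte Carlo PageRank estimates deciding where pushing is worthwhile, which is what removes the dependence on global quantities for small $Δ_{in}$.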