Perfect $L_p$ Sampling with Polylogarithmic Update Time

2025-11-29
AI Summary
Perfect $L_p$ sampling ($0 < p < 2$) in the turnstile streaming model asks that coordinate $i$ be returned with probability $|x_i|^p / \|x\|_p^p$, up to additive error at most $n^{-C}$. The existing algorithm achieves optimal space but needs update time at least $n^C$, a polynomial whose degree grows with the accuracy constant $C$. This work presents the first perfect $L_p$ sampler achieving both optimal space complexity $\tilde{O}(\log^2 n)$ and $\mathrm{poly}(\log n)$ update time. The method simulates a sum of reciprocals of powers of truncated exponential random variables by approximating its characteristic function, then combines the Gil-Pelaez inversion formula with variants of the trapezoid formula to approximate the cumulative distribution function efficiently. This removes the update-time bottleneck of the earlier perfect sampler, a step toward practical use in applications such as streaming spectral analysis and sparse recovery.

Abstract
Perfect $L_p$ sampling in a stream was introduced by Jayaram and Woodruff (FOCS 2018) as a streaming primitive which, given turnstile updates to a vector $x \in \{-\text{poly}(n), \ldots, \text{poly}(n)\}^n$, outputs an index $i^* \in \{1, 2, \ldots, n\}$ such that the probability of returning index $i$ is exactly \[\Pr[i^* = i] = \frac{|x_i|^p}{\|x\|_p^p} \pm \frac{1}{n^C},\] where $C > 0$ is an arbitrarily large constant. Jayaram and Woodruff achieved the optimal $\tilde{O}(\log^2 n)$ bits of memory for $0 < p < 2$, but their update time is at least $n^C$ per stream update. Thus an important open question is to achieve efficient update time while maintaining optimal space. For $0 < p < 2$, we give the first perfect $L_p$-sampler with the same optimal amount of memory but with only $\text{poly}(\log n)$ update time. Crucial to our result is an efficient simulation of a sum of reciprocals of powers of truncated exponential random variables by approximating its characteristic function, using the Gil-Pelaez inversion formula, and applying variants of the trapezoid formula to quickly approximate it.
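To make the target distribution in the abstract concrete, here is a minimal offline sketch in Python. It is not the paper's streaming algorithm: it stores the whole vector and uses the standard exponential-scaling fact that, for independent $E_i \sim \mathrm{Exp}(1)$, the index minimizing $E_i / |x_i|^p$ is distributed as $|x_i|^p / \|x\|_p^p$. The function names `lp_target_distribution` and `lp_sample_offline` are hypothetical and chosen only for illustration.

```python
import random


def lp_target_distribution(x, p):
    """Return the target probabilities |x_i|^p / ||x||_p^p for a nonzero vector x."""
    weights = [abs(v) ** p for v in x]
    total = sum(weights)
    if total == 0:
        raise ValueError("x must have at least one nonzero coordinate")
    return [w / total for w in weights]


def lp_sample_offline(x, p, rng):
    """Offline reference sampler (assumption: full access to x, no streaming).

    Draws E_i ~ Exp(1) for each nonzero coordinate and returns the index
    minimizing E_i / |x_i|^p, which equals the index maximizing
    |x_i| / E_i^(1/p).
    """
    best_index, best_key = None, float("inf")
    for i, v in enumerate(x):
        if v == 0:
            continue  # zero coordinates have probability 0
        key = rng.expovariate(1.0) / abs(v) ** p
        if key < best_key:
            best_index, best_key = i, key
    if best_index is None:
        raise ValueError("x must have at least one nonzero coordinate")
    return best_index


if __name__ == "__main__":
    x = [3, -4, 0, 1]
    rng = random.Random(0)
    trials = 20000
    counts = [0] * len(x)
    for _ in range(trials):
        counts[lp_sample_offline(x, 1.0, rng)] += 1
    print("target   :", lp_target_distribution(x, 1.0))
    print("empirical:", [c / trials for c in counts])
```

The empirical frequencies should approach the target as the number of trials grows. The difficulty the abstract addresses is reproducing this distribution to within $n^{-C}$ using only small space and fast updates, which this sketch does not attempt.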
Problem

Research questions and friction points this paper is trying to address.

Achieves perfect Lp sampling with polylogarithmic update time
Improves prior work's update time from at least $n^C$ to poly(log n)
Simulates sum of reciprocals using characteristic function approximation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Perfect Lp sampling with polylogarithmic update time
Simulating sum of reciprocals using characteristic function approximation
Applying Gil-Pelaez inversion and trapezoid formula variants
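
The combination of Gil-Pelaez inversion and trapezoid integration named above can be illustrated with a small Python sketch. The Gil-Pelaez formula recovers a CDF from a characteristic function $\varphi$ as $F(x) = \tfrac{1}{2} - \tfrac{1}{\pi} \int_0^\infty \operatorname{Im}[e^{-itx} \varphi(t)] / t \, dt$. The example below is an assumption-laden stand-in: it uses the standard normal characteristic function rather than the paper's sum of reciprocals of powers of truncated exponentials, a plain composite trapezoid rule rather than the paper's variants, and a fixed truncation point `upper`. The names `gil_pelaez_cdf` and `normal_cf` are hypothetical.

```python
import cmath
import math


def gil_pelaez_cdf(phi, x, mean, upper=10.0, steps=1000):
    """Approximate F(x) from a characteristic function phi via Gil-Pelaez.

    Assumptions: the distribution has a finite mean (used for the t -> 0
    limit of the integrand) and phi decays fast enough that truncating the
    integral at `upper` is harmless.
    """
    h = upper / steps

    def integrand(t):
        if t == 0.0:
            return mean - x  # limit of Im[e^{-itx} phi(t)] / t as t -> 0
        return (cmath.exp(-1j * t * x) * phi(t)).imag / t

    # Composite trapezoid rule on [0, upper] with uniform step h.
    total = 0.5 * (integrand(0.0) + integrand(upper))
    for k in range(1, steps):
        total += integrand(k * h)
    return 0.5 - (h * total) / math.pi


def normal_cf(t):
    """Characteristic function of the standard normal distribution."""
    return math.exp(-0.5 * t * t)


if __name__ == "__main__":
    for x in (-1.5, 0.0, 1.0):
        approx = gil_pelaez_cdf(normal_cf, x, mean=0.0)
        exact = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
        print(x, approx, exact)
```

The closed-form normal CDF serves only as a check on the numerical inversion. For the heavy-tailed sums in the paper, the choice of truncation point and step size is presumably where the trapezoid variants matter, and this sketch makes no claim about them.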