🤖 AI Summary
Existing theoretical analyses of randomized block Krylov methods for rank-$k$ matrix approximation yield tight bounds only for the extreme block sizes $b = 1$ or $b = k$, whereas practical deployments commonly use intermediate block sizes $1 \ll b \ll k$ to improve efficiency. In this regime, prior theory incurred an overhead scaling with $b(k-b)$, which can be as large as $O(k^2)$, in stark contrast to the method's empirical performance.
Method: We introduce new bounds on the minimum singular value of a random block Krylov matrix, drawing on ideas from recent fast algorithms for sparse linear systems.
Contribution/Results: We establish, for the first time, that for any block size $1 \leq b \leq k$, a $(1+\varepsilon)$-accurate rank-$k$ approximation can be computed using $\widetilde{O}(k/\sqrt{\varepsilon})$ matrix-vector multiplications, eliminating the $b(k-b)$ factor present in prior analyses. This makes the bound independent of the block size and consistent with the efficiency empirically observed for moderate $b$.
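To see why the result is block-size independent (an accounting implied by the stated bound, not a line from the paper): each iteration of a depth-$q$ block Krylov method applies the matrix to a block of $b$ vectors, so the total matrix-vector product count is roughly $b \cdot q$, and the claimed bound corresponds to an iteration depth that shrinks inversely with $b$:

$$ b \cdot q \;=\; \widetilde{O}\!\left(\frac{k}{\sqrt{\varepsilon}}\right) \quad\Longrightarrow\quad q \;=\; \widetilde{O}\!\left(\frac{k}{b\sqrt{\varepsilon}}\right). $$

The endpoints recover the previously known cases: $b = k$ gives depth $\widetilde{O}(1/\sqrt{\varepsilon})$, while $b = 1$ gives depth $\widetilde{O}(k/\sqrt{\varepsilon})$.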
📝 Abstract
We study the problem of computing a rank-$k$ approximation of a matrix using randomized block Krylov iteration. Prior work has shown that, for block size $b = 1$ or $b = k$, a $(1 + \varepsilon)$-factor approximation to the best rank-$k$ approximation can be obtained after $\tilde{O}(k/\sqrt{\varepsilon})$ matrix-vector products with the target matrix. On the other hand, when $b$ is between $1$ and $k$, the best known bound on the number of matrix-vector products scales with $b(k-b)$, which could be as large as $O(k^2)$. Nevertheless, in practice, the performance of block Krylov methods is often optimized by choosing a block size $1 \ll b \ll k$. We resolve this theory-practice gap by proving that randomized block Krylov iteration produces a $(1 + \varepsilon)$-factor approximation to the best rank-$k$ approximation using $\tilde{O}(k/\sqrt{\varepsilon})$ matrix-vector products for any block size $1 \le b \le k$. Our analysis relies on new bounds for the minimum singular value of a random block Krylov matrix, which may be of independent interest. Similar bounds are central to recent breakthroughs on faster algorithms for sparse linear systems [Peng & Vempala, SODA 2021; Nie, STOC 2022].
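For intuition, here is a minimal NumPy sketch of the standard randomized block Krylov template the abstract refers to; the function name `block_krylov_low_rank`, the Gaussian start block, and the rank-$k$ extraction step are common conventions assumed here, not details taken from the paper.

```python
import numpy as np

def block_krylov_low_rank(A, k, b, q, seed=0):
    """Rank-k approximation of A via randomized block Krylov iteration.

    Builds the Krylov matrix K = [A G, (A A^T) A G, ..., (A A^T)^q A G]
    for a Gaussian start block G with b columns, then returns the best
    rank-k approximation of A whose column span lies in range(K).
    Requires b * (q + 1) >= k so the subspace can contain a rank-k factor.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Y = A @ rng.standard_normal((n, b))  # b matrix-vector products
    blocks = [Y]
    for _ in range(q):
        Y = A @ (A.T @ Y)                # 2b matrix-vector products per iteration
        blocks.append(Y)
    K = np.hstack(blocks)                # m x b(q+1) Krylov matrix
    Q, _ = np.linalg.qr(K)               # orthonormal basis for range(K)
    # Best rank-k approximation of A within range(Q): truncated SVD of Q^T A.
    U, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U[:, :k]) * s[:k], Vt[:k, :]

# Example: moderate block size b, with depth q chosen so that b*(q+1) >= k.
A = np.random.default_rng(1).standard_normal((500, 300))
L, Vt = block_krylov_low_rank(A, k=20, b=4, q=10)
print(np.linalg.norm(A - L @ Vt, "fro"))  # approaches the best rank-20 error as q grows
```

The total number of matrix-vector products in this sketch is $(2q+1)\,b$; the paper's result says a depth $q = \widetilde{O}(k/(b\sqrt{\varepsilon}))$ suffices for a $(1+\varepsilon)$-factor approximation at any block size $1 \le b \le k$, so the total stays $\widetilde{O}(k/\sqrt{\varepsilon})$.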