🤖 AI Summary
This work addresses efficiency bottlenecks in solving large-scale linear systems and approximating matrix norms. We propose a multilevel randomized sketching preconditioned iterative method, integrating Nyström low-rank approximation, sparse random sketching, and multilevel preconditioning, and establish the first multilevel sketched preconditioning framework whose convergence is governed by a natural average condition number. Theoretical contributions include: (1) optimal complexity $\tilde{O}(n^2 + d_\lambda^\omega)$ for solving regularized linear systems; (2) accelerated complexity $\tilde{O}(n^{2.065} + k^\omega)$ for systems with $k$ outlying singular values; and (3) Schatten-$p$ norm approximation, in particular the nuclear norm, in $\tilde{O}(n^{2.11})$ time, improving upon the prior best $\tilde{O}(n^{2.18})$. These advances significantly enhance computational efficiency for key subproblems in applications such as Gaussian process regression.
📝 Abstract
We present a new class of preconditioned iterative methods for solving linear systems of the form $Ax = b$. Our methods are based on constructing a low-rank Nyström approximation to $A$ using sparse random matrix sketching. This approximation is used to construct a preconditioner, which itself is inverted quickly using additional levels of random sketching and preconditioning. We prove that the convergence of our methods depends on a natural average condition number of $A$, which improves as the rank of the Nyström approximation increases. Concretely, this allows us to obtain faster runtimes for a number of fundamental linear algebraic problems:

1. We show how to solve any $n \times n$ linear system that is well-conditioned except for $k$ outlying large singular values in $\tilde{O}(n^{2.065} + k^\omega)$ time, improving on a recent result of [Dereziński, Yang, STOC 2024] for all $k \gtrsim n^{0.78}$.
2. We give the first $\tilde{O}(n^2 + d_\lambda^\omega)$ time algorithm for solving a regularized linear system $(A + \lambda I)x = b$, where $A$ is positive semidefinite with effective dimension $d_\lambda = \mathrm{tr}(A(A+\lambda I)^{-1})$. This problem arises in applications like Gaussian process regression.
3. We give faster algorithms for approximating Schatten $p$-norms and other matrix norms. For example, for the Schatten 1-norm (nuclear norm), we give an algorithm that runs in $\tilde{O}(n^{2.11})$ time, improving on an $\tilde{O}(n^{2.18})$ method of [Musco et al., ITCS 2018].

All results are proven in the real RAM model of computation. Interestingly, previous state-of-the-art algorithms for most of the problems above relied on stochastic iterative methods, like stochastic coordinate and gradient descent. Our work takes a completely different approach, instead leveraging tools from matrix sketching.
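The core primitive the abstract describes — a randomized Nyström approximation used as a preconditioner for a regularized system $(A + \lambda I)x = b$ — can be sketched as follows. This is an illustrative single-level version with a dense Gaussian test matrix, not the paper's algorithm: the paper uses sparse sketches and additional recursive levels of sketching and preconditioning, and the function names `nystrom_approx` and `nystrom_pcg` are ours.

```python
import numpy as np

def nystrom_approx(A, r, rng):
    """Rank-r randomized Nystrom approximation of a PSD matrix A.

    Returns (U, lam) with A ~= U @ diag(lam) @ U.T, lam sorted descending.
    """
    n = A.shape[0]
    Omega, _ = np.linalg.qr(rng.standard_normal((n, r)))  # orthonormal test matrix
    Y = A @ Omega
    nu = np.sqrt(n) * np.finfo(Y.dtype).eps * np.linalg.norm(Y)  # stability shift
    Y_nu = Y + nu * Omega
    C = np.linalg.cholesky(Omega.T @ Y_nu)       # core matrix Omega^T A Omega + nu*I
    B = np.linalg.solve(C, Y_nu.T).T             # B = Y_nu C^{-T}, so B B^T = Nystrom approx
    U, S, _ = np.linalg.svd(B, full_matrices=False)
    lam = np.maximum(S**2 - nu, 0.0)             # undo the shift on the eigenvalues
    return U, lam

def nystrom_pcg(A, b, lam_reg, r=50, tol=1e-8, maxiter=500, seed=0):
    """Solve (A + lam_reg*I) x = b by CG, preconditioned with a Nystrom approximation."""
    U, lam = nystrom_approx(A, r, np.random.default_rng(seed))
    scale = lam[-1] + lam_reg                    # smallest retained eigenvalue + shift

    def apply_Pinv(v):
        # P^{-1} = scale * U (Lam + lam_reg I)^{-1} U^T + (I - U U^T)
        Uv = U.T @ v
        return v - U @ Uv + U @ (scale / (lam + lam_reg) * Uv)

    # Standard preconditioned conjugate gradient loop.
    x = np.zeros_like(b)
    res = b.copy()
    z = apply_Pinv(res)
    p = z.copy()
    rz = res @ z
    for _ in range(maxiter):
        Ap = A @ p + lam_reg * p
        alpha = rz / (p @ Ap)
        x += alpha * p
        res -= alpha * Ap
        if np.linalg.norm(res) <= tol * np.linalg.norm(b):
            break
        z = apply_Pinv(res)
        rz_new = res @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x
```

The preconditioner flattens the $k$ (or $d_\lambda$) large outlying eigenvalues captured by the sketch while leaving the well-conditioned tail untouched, which is what makes the iteration count depend on an average, rather than worst-case, condition number.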