🤖 AI Summary
This work addresses large-scale kernel ridge regression (KRR) problems with $N = 10^4$–$10^7$ samples and proposes two efficient randomized preconditioning methods, RPCholesky and KRILL. RPCholesky uses a randomly pivoted partial Cholesky decomposition to build a low-rank preconditioner, enabling accurate solution of the full-data KRR problem in $O(N^2)$ arithmetic operations when the kernel matrix eigenvalues exhibit sufficiently rapid polynomial decay. KRILL targets a restricted version of the KRR problem posed on $k \ll N$ selected data centers and solves it in $O((N + k^2) k \log k)$ operations. Both methods substantially accelerate iterative-solver convergence and improve numerical robustness. Experiments across multiple benchmarks demonstrate that the two approaches combine high accuracy with strong scalability, enabling efficient and stable deployment of KRR on million-scale datasets.
📝 Abstract
This paper investigates two randomized preconditioning techniques for solving kernel ridge regression (KRR) problems with a medium to large number of data points ($10^4 \leq N \leq 10^7$), and it introduces two new methods with state-of-the-art performance. The first method, RPCholesky preconditioning, accurately solves the full-data KRR problem in $O(N^2)$ arithmetic operations, assuming sufficiently rapid polynomial decay of the kernel matrix eigenvalues. The second method, KRILL preconditioning, offers an accurate solution to a restricted version of the KRR problem involving $k \ll N$ selected data centers at a cost of $O((N + k^2) k \log k)$ operations. The proposed methods solve a broad range of KRR problems, making them ideal for practical applications.
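The abstract does not spell out how RPCholesky preconditioning works, so the following is a minimal illustrative sketch, not the paper's implementation. It assumes "RPCholesky" refers to the randomly pivoted partial Cholesky factorization, which produces a low-rank factor $F$ with $K \approx FF^\top$; the factor then defines a preconditioner $P = FF^\top + \lambda I$ for conjugate gradient on the KRR system $(K + \lambda I)\alpha = y$, applied cheaply via the Woodbury identity. All function names are illustrative.

```python
import numpy as np

def rpcholesky(A, k, rng):
    """Randomly pivoted partial Cholesky: returns F with A ~= F @ F.T (rank <= k)."""
    N = A.shape[0]
    F = np.zeros((N, k))
    d = np.diag(A).astype(float).copy()   # residual diagonal
    trace = d.sum()
    for i in range(k):
        if d.sum() <= 1e-12 * trace:      # residual negligible; stop early
            return F[:, :i]
        p = rng.choice(N, p=d / d.sum())  # sample pivot ~ residual diagonal
        g = A[:, p] - F[:, :i] @ F[p, :i] # residual column at the pivot
        if g[p] <= 0.0:                   # numerical breakdown; residual is negligible
            return F[:, :i]
        F[:, i] = g / np.sqrt(g[p])
        d = np.maximum(d - F[:, i] ** 2, 0.0)
    return F

def make_preconditioner(F, lam):
    """Return v -> (F F^T + lam I)^{-1} v, applied via the Woodbury identity."""
    k = F.shape[1]
    M = lam * np.eye(k) + F.T @ F
    def apply(v):
        return (v - F @ np.linalg.solve(M, F.T @ v)) / lam
    return apply

def pcg(matvec, b, precond, tol=1e-8, maxiter=200):
    """Preconditioned conjugate gradient for a symmetric positive-definite system."""
    x = np.zeros_like(b)
    r = b - matvec(x)
    z = precond(r)
    p = z.copy()
    rz = r @ z
    for _ in range(maxiter):
        Ap = matvec(p)
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = precond(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x
```

With rapid eigenvalue decay, a rank-$k$ factor with small $k$ already captures most of the kernel's spectrum, so the preconditioned system is well conditioned and CG converges in a few iterations; this is the mechanism behind the claimed $O(N^2)$ cost (dominated by kernel matrix-vector products).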