Robust, randomized preconditioning for kernel ridge regression

📅 2023-04-24
🏛️ arXiv.org
📈 Citations: 16
Influential: 1
🤖 AI Summary
This work addresses large-scale kernel ridge regression (KRR) problems with $N = 10^4$–$10^7$ samples. We propose two efficient randomized preconditioning methods: RPCholesky and KRILL. RPCholesky uses randomly pivoted partial Cholesky decomposition to build a low-rank preconditioner, enabling accurate full-data KRR solutions at $O(N^2)$ cost when the kernel matrix eigenvalues decay polynomially. KRILL targets a restricted KRR problem with $k \ll N$ selected centers, combining a low-rank kernel approximation with data-adaptive center selection at $O((N + k^2)k \log k)$ cost. Both methods substantially accelerate iterative-solver convergence and improve numerical robustness. Experiments across multiple benchmarks demonstrate that the approaches achieve high accuracy and strong scalability, enabling efficient and stable KRR on million-scale datasets.
📝 Abstract
This paper investigates two randomized preconditioning techniques for solving kernel ridge regression (KRR) problems with a medium to large number of data points ($10^4 \leq N \leq 10^7$), and it introduces two new methods with state-of-the-art performance. The first method, RPCholesky preconditioning, accurately solves the full-data KRR problem in $O(N^2)$ arithmetic operations, assuming sufficiently rapid polynomial decay of the kernel matrix eigenvalues. The second method, KRILL preconditioning, offers an accurate solution to a restricted version of the KRR problem involving $k \ll N$ selected data centers at a cost of $O((N + k^2) k \log k)$ operations. The proposed methods solve a broad range of KRR problems, making them ideal for practical applications.
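A minimal sketch of randomly pivoted partial Cholesky, the idea behind RPCholesky, under simplifying assumptions: the kernel matrix is materialized here for clarity, and the function name is illustrative. The real algorithm only needs the diagonal of $K$ plus the $k$ columns it samples.

```python
import numpy as np

def rp_cholesky(K, k, rng=None):
    """Rank-k randomly pivoted partial Cholesky: K ~ F @ F.T.

    K is materialized here for simplicity; the actual algorithm only
    reads the diagonal of K plus the k pivot columns it samples.
    """
    rng = np.random.default_rng(rng)
    N = K.shape[0]
    F = np.zeros((N, k))
    d = np.array(np.diag(K), dtype=float)  # residual diagonal of K - F F^T
    for i in range(k):
        # sample the next pivot with probability proportional to d
        p = rng.choice(N, p=d / d.sum())
        g = K[:, p] - F[:, :i] @ F[p, :i]  # residual column at the pivot
        F[:, i] = g / np.sqrt(g[p])        # g[p] == d[p] > 0 by construction
        d = np.clip(d - F[:, i] ** 2, 0.0, None)  # guard against round-off
    return F
```

Each step reads one column of $K$ and updates the factor in $O(Nk)$ time, so the whole sketch costs $O(Nk^2)$; the diagonal-weighted sampling steers pivots toward rows with large remaining energy.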
Problem

Research questions and friction points this paper is trying to address.

Solving kernel ridge regression for large datasets efficiently
Developing randomized preconditioning methods for KRR
Reducing computational costs in medium to large-scale KRR
Innovation

Methods, ideas, or system contributions that make the work stand out.

RPCholesky preconditioning solves full-data KRR in O(N^2) operations
KRILL preconditioning uses k selected centers for restricted KRR
Both methods employ randomized preconditioning for efficient computation
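As a hedged illustration of how a rank-$k$ factor $F$ (for instance from a partial Cholesky approximation) could precondition conjugate gradient on the KRR system $(K + \lambda I)\alpha = y$: a sketch under assumed names, not the paper's implementation. The preconditioner $P = FF^\top + \lambda I$ is inverted with the standard Woodbury identity.

```python
import numpy as np

def krr_pcg(K, y, lam, F, tol=1e-8, maxiter=200):
    """Preconditioned CG for (K + lam*I) @ a = y.

    The preconditioner P = F @ F.T + lam*I is applied in O(N*k) per
    iteration via the Woodbury identity, after one k x k Cholesky.
    """
    N, k = F.shape
    C = np.linalg.cholesky(F.T @ F + lam * np.eye(k))

    def apply_pinv(v):
        # Woodbury: P^{-1} v = (v - F (F^T F + lam I)^{-1} F^T v) / lam
        w = np.linalg.solve(C.T, np.linalg.solve(C, F.T @ v))
        return (v - F @ w) / lam

    a = np.zeros(N)
    r = y.copy()                  # residual for the initial guess a = 0
    z = apply_pinv(r)
    p, rz = z.copy(), r @ z
    for _ in range(maxiter):
        Ap = K @ p + lam * p
        alpha = rz / (p @ Ap)
        a += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(y):
            break
        z = apply_pinv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return a
```

When $F$ captures the dominant eigenspace of $K$, the preconditioned system's spectrum clusters near 1 and CG converges in far fewer iterations than on $(K + \lambda I)$ directly.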
M. Díaz
Department of Applied Mathematics & Statistics, Johns Hopkins University, Baltimore, MD
Ethan N. Epperly
Miller research fellow, UC Berkeley
Randomized Algorithms · Mathematics of Data Science · Matrix Computations · Quantum Algorithms
Zachary Frangella
Stanford University
Machine Learning · Optimization · Numerical Linear Algebra
J. Tropp
Computing & Mathematical Sciences, California Institute of Technology, Pasadena, CA
R. Webber
Department of Mathematics, University of California San Diego, La Jolla, CA