Analysis of Nyström method with sequential ridge leverage scores

📅 2026-04-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Large-scale kernel ridge regression is hindered by the prohibitive storage cost of the full kernel matrix, necessitating efficient approximation methods. This work proposes INK-ESTIMATE, an algorithm that, in a streaming setting, achieves the first single-pass incremental estimation of ridge leverage scores without revisiting historical data, maintaining only a fixed-size sketch of the kernel matrix. By integrating Nyström approximation with kernel sketching techniques, the method provides theoretical guarantees on both approximation error and statistical risk at any intermediate step. Empirical results demonstrate that the algorithm effectively approximates the kernel matrix under constant, modest memory constraints while preserving high accuracy and strong statistical performance of the resulting kernel ridge regression solution.

📝 Abstract

Large-scale kernel ridge regression (KRR) is limited by the need to store a large kernel matrix K_t. To avoid storing the entire matrix K_t, Nyström methods subsample a subset of columns of the kernel matrix and efficiently find an approximate KRR solution on the reconstructed matrix. The chosen subsampling distribution in turn affects the statistical and computational tradeoffs. For KRR problems, recent works show that a sampling distribution proportional to the ridge leverage scores (RLSs) provides strong reconstruction guarantees for the approximation. While exact RLSs are as difficult to compute as a KRR solution, we may be able to approximate them well enough. In this paper, we study KRR problems in a sequential setting and introduce the INK-ESTIMATE algorithm, which incrementally computes estimates of the RLSs. INK-ESTIMATE maintains a small sketch of K_t, which at each step is used to compute an intermediate estimate of the RLSs. First, our sketch update does not require access to previously seen columns, and therefore a single pass over the kernel matrix is sufficient. Second, the algorithm requires a fixed, small space budget that depends only on the effective dimension of the kernel matrix. Finally, our sketch provides strong approximation guarantees on the distance between the true kernel matrix and its approximation, and on the statistical risk of the approximate KRR solution at any time, because all our guarantees hold at any intermediate step.
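
To make the quantities named in the abstract concrete, the following is a minimal batch sketch of exact ridge leverage scores, the effective dimension, and RLS-proportional Nyström column sampling. It is not INK-ESTIMATE: it stores the full kernel matrix and computes exact scores, which is precisely the cost the paper aims to avoid. The RBF kernel, the regularization value gamma, the sample size m, and all function names are assumptions chosen for illustration, and the standard definition tau_i = [K (K + gamma I)^{-1}]_{ii} is assumed for the scores.

```python
import numpy as np


def rbf_kernel(X, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix; the kernel choice is an assumption."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * bandwidth**2))


def ridge_leverage_scores(K, gamma):
    """Exact RLSs: tau_i = [K (K + gamma I)^{-1}]_{ii} (batch, O(n^3))."""
    n = K.shape[0]
    # K and (K + gamma I)^{-1} commute, so solving gives the same diagonal.
    return np.diag(np.linalg.solve(K + gamma * np.eye(n), K)).copy()


def nystrom(K, idx):
    """Plain Nystrom reconstruction from the columns listed in idx."""
    C = K[:, idx]
    W = K[np.ix_(idx, idx)]
    # rcond is an illustrative cutoff to guard against ill-conditioning.
    return C @ np.linalg.pinv(W, rcond=1e-10) @ C.T


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))   # synthetic data, for illustration only
K = rbf_kernel(X)
gamma = 1.0                     # hypothetical regularization parameter

tau = ridge_leverage_scores(K, gamma)
d_eff = tau.sum()               # effective dimension = sum of the RLSs

m = 40                          # hypothetical column budget
idx = rng.choice(len(tau), size=m, replace=False, p=tau / tau.sum())
K_approx = nystrom(K, idx)

# Relative spectral-norm reconstruction error of the approximation.
rel_err = np.linalg.norm(K - K_approx, 2) / np.linalg.norm(K, 2)
print(f"effective dimension: {d_eff:.2f}, relative error: {rel_err:.3e}")
```

In this batch version, the sum of the scores gives the effective dimension that the abstract says governs the space budget. A sequential method in the spirit of the paper would replace `ridge_leverage_scores` with estimates computed from a small sketch that is updated as columns arrive.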

Problem

Research questions and friction points this paper is trying to address.

kernel ridge regression

Nyström method

ridge leverage scores

sequential setting

large-scale kernel matrix

Innovation

Methods, ideas, or system contributions that make the work stand out.

Nyström method

ridge leverage scores

kernel ridge regression