AI Summary
Large-scale kernel ridge regression is hindered by the prohibitive storage cost of the full kernel matrix, necessitating efficient approximation methods. This work proposes INK-ESTIMATE, an algorithm that, in a streaming setting, achieves the first single-pass incremental estimation of ridge leverage scores without revisiting historical data, maintaining only a fixed-size sketch of the kernel matrix. By integrating Nyström approximation with kernel sketching techniques, the method provides theoretical guarantees on both approximation error and statistical risk at any intermediate step. Empirical results demonstrate that the algorithm effectively approximates the kernel matrix under constant, modest memory constraints while preserving high accuracy and strong statistical performance of the resulting kernel ridge regression solution.
Abstract
Large-scale kernel ridge regression (KRR) is limited by the need to store a large kernel matrix K_t. To avoid storing the entire matrix K_t, Nyström methods subsample a subset of columns of the kernel matrix and efficiently find an approximate KRR solution on the reconstructed matrix. The chosen subsampling distribution in turn affects the statistical and computational tradeoffs. For KRR problems, recent works show that a sampling distribution proportional to the ridge leverage scores (RLSs) provides strong reconstruction guarantees for the approximation. While exact RLSs are as difficult to compute as a KRR solution, we may be able to approximate them well enough. In this paper, we study KRR problems in a sequential setting and introduce the INK-ESTIMATE algorithm, which incrementally computes estimates of the RLSs. INK-ESTIMATE maintains a small sketch of K_t that is used at each step to compute an intermediate estimate of the RLSs. First, our sketch update does not require access to previously seen columns, and therefore a single pass over the kernel matrix is sufficient. Second, the algorithm requires a fixed, small space budget that depends only on the effective dimension of the kernel matrix. Finally, our sketch provides strong approximation guarantees on the distance between the true kernel matrix and its approximation, and on the statistical risk of the approximate KRR solution; all of these guarantees hold at any intermediate step.
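To make the quantities in the abstract concrete, the following is a minimal batch sketch, not INK-ESTIMATE itself. It assumes the standard definition of the ridge leverage scores, tau_i = [K (K + gamma I)^{-1}]_{ii}, whose sum is the effective dimension, and it computes them exactly from the full kernel matrix, which is precisely the cost the paper sets out to avoid. The RBF kernel, the regularization `gamma`, the column budget `m`, the unweighted pseudo-inverse form of the Nyström reconstruction, and all function names are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np


def rbf_kernel(X, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix; the kernel choice is an assumption."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth ** 2))


def ridge_leverage_scores(K, gamma):
    """Exact RLSs: tau_i = [K (K + gamma I)^{-1}]_{ii}.

    Needs the full n x n matrix and O(n^3) time, so this is only a
    reference computation, not a streaming estimator.
    """
    n = K.shape[0]
    # (K + gamma I)^{-1} K has the same diagonal as K (K + gamma I)^{-1}
    # because both factors are functions of the same symmetric matrix.
    return np.diag(np.linalg.solve(K + gamma * np.eye(n), K)).copy()


def nystrom_from_rls(K, gamma, m, rng):
    """Sample m columns with probability proportional to the RLSs and
    build the plain (unweighted) Nystrom reconstruction C W^+ C^T."""
    tau = np.clip(ridge_leverage_scores(K, gamma), 0.0, None)
    probs = tau / tau.sum()
    idx = rng.choice(K.shape[0], size=m, replace=False, p=probs)
    C = K[:, idx]                      # sampled columns
    W = K[np.ix_(idx, idx)]            # intersection block
    K_hat = C @ np.linalg.pinv(W, rcond=1e-10) @ C.T
    return K_hat, idx, tau


# Illustrative usage on synthetic data (all values are assumptions).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
K = rbf_kernel(X, bandwidth=1.5)
gamma = 1.0

K_hat, idx, tau = nystrom_from_rls(K, gamma, m=40, rng=rng)
d_eff = tau.sum()  # effective dimension at this gamma
rel_err = np.linalg.norm(K - K_hat, 2) / np.linalg.norm(K, 2)
print(f"effective dimension: {d_eff:.2f}, relative spectral error: {rel_err:.3e}")
```

In this sketch the sampling step needs every column of K up front. The abstract's contribution is to replace that step with estimates computed from a small sketch that is updated in a single pass.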