Large Dimensional Kernel Ridge Regression: Extending to Product Kernels

📅 2026-05-14

🤖 AI Summary
This work investigates whether saturation effects and multiple descent phenomena in high-dimensional kernel ridge regression are universal across broader kernel classes, particularly product kernels, addressing a limitation of existing theory, which relies on inner-product kernels or strong eigenfunction assumptions. By integrating tools from high-dimensional statistical learning theory, spectral analysis, and random matrix theory, the authors develop a unified non-asymptotic and asymptotic framework for analyzing generalization error under product kernels. Their main contribution is the first extension of minimax optimality, saturation behavior, periodic plateaus, and multiple descent structures to product kernels. They establish that minimax optimal rates are achieved when the source condition exponent satisfies \( s \leq 1 \), while saturation occurs for \( s > 1 \), and further show that the generalization error exhibits periodic plateaus and multiple descent patterns as a function of the sample size \( n \).
📝 Abstract
Recent studies have reported $\textit{saturation effects}$ and $\textit{multiple descent behavior}$ in large dimensional kernel ridge regression (KRR). However, these findings are predominantly derived under restrictive settings, such as inner product kernels on the sphere or strong eigenfunction assumptions like hypercontractivity. Whether such behaviors hold for other kernels remains an open question. In this paper, we establish a broad, new family of large dimensional kernels and derive the corresponding convergence rates of the generalization error. As a result, we recover key phenomena previously associated with inner product kernels on the sphere, including: $i)$ the $\textit{minimax optimality}$ when the source condition $s\le 1$; $ii)$ the $\textit{saturation effect}$ when $s>1$; $iii)$ a $\textit{periodic plateau phenomenon}$ in the convergence rate and a $\textit{multiple-descent behavior}$ with respect to the sample size $n$.
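For readers unfamiliar with the setup, the following is a minimal sketch of kernel ridge regression with a product kernel. It is not the paper's construction: the coordinate-wise Laplace kernel, the bandwidth `gamma`, the ridge normalization `K + n * lam * I`, the toy target function, and all function names are assumptions chosen for illustration only.

```python
import numpy as np

def product_kernel(X, Y, gamma=1.0):
    """Product of 1-D Laplace kernels: prod_j exp(-gamma * |x_j - y_j|).

    The choice of the 1-D Laplace kernel is an illustrative assumption.
    """
    diffs = np.abs(X[:, None, :] - Y[None, :, :])    # shape (n, m, d)
    return np.prod(np.exp(-gamma * diffs), axis=2)   # multiply over coordinates

def krr_fit(X, y, lam, gamma=1.0):
    """Solve (K + n * lam * I) alpha = y for the KRR coefficients."""
    n = X.shape[0]
    K = product_kernel(X, X, gamma)
    return np.linalg.solve(K + n * lam * np.eye(n), y)

def krr_predict(X_train, alpha, X_new, gamma=1.0):
    """Evaluate the fitted function at new points."""
    return product_kernel(X_new, X_train, gamma) @ alpha

# Toy data (hypothetical): n samples in d dimensions with a smooth target.
rng = np.random.default_rng(0)
n, d = 50, 5
X = rng.uniform(0.0, 1.0, size=(n, d))
y = np.sin(X.sum(axis=1)) + 0.1 * rng.standard_normal(n)

K = product_kernel(X, X)
alpha = krr_fit(X, y, lam=1e-3)
X_test = rng.uniform(0.0, 1.0, size=(5, d))
preds = krr_predict(X, alpha, X_test)
```

The regularization parameter `lam` is the quantity whose tuning interacts with the source condition $s$ in results of this kind; the sketch fixes it by hand rather than following any rate prescribed in the paper.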
Problem

Research questions and friction points this paper is trying to address.

kernel ridge regression
saturation effect
multiple descent
generalization error
product kernels
Innovation

Methods, ideas, or system contributions that make the work stand out.

kernel ridge regression
product kernels
saturation effect
multiple descent
generalization error