🤖 AI Summary
This work addresses a practical limitation of Random Fourier Features (RFF): the method is rarely applied beyond the Gaussian kernel, because the spectral distributions of other isotropic shift-invariant kernels are harder to identify and sample. We propose a unified scale-mixture representation of spectral distributions: we prove that the spectral distribution of any isotropic positive-definite shift-invariant kernel can be written as a scale mixture of α-stable random vectors, with a scaling distribution determined analytically by the kernel function. Building on this characterization, we derive a general spectral sampling formula for a broad class of kernels, including the exponential-power, generalized Matérn, generalized Cauchy, Beta, Kummer, and Tricomi families. Because the spectral distributions of these kernels are scale mixtures of the multivariate Gaussian, existing Gaussian RFF implementations can be adapted with little change. The approach extends RFF-based algorithms such as SVMs, kernel ridge regression, and Gaussian processes to these kernel families while preserving computational efficiency.
📝 Abstract
Rahimi and Recht (2007) introduced the idea of decomposing positive definite shift-invariant kernels by randomly sampling from their spectral distribution. This famous technique, known as Random Fourier Features (RFF), is in principle applicable to any such kernel whose spectral distribution can be identified and simulated. In practice, however, it is usually applied to the Gaussian kernel because of its simplicity, since its spectral distribution is also Gaussian. Clearly, simple spectral sampling formulas would be desirable for broader classes of kernels. In this paper, we prove that the spectral distribution of every positive definite isotropic kernel can be decomposed as a scale mixture of $\alpha$-stable random vectors, and we identify the scaling distribution as a function of the kernel. This constructive decomposition provides a simple and ready-to-use spectral sampling formula for every multivariate positive definite shift-invariant kernel, including exponential power kernels, generalized Matérn kernels, generalized Cauchy kernels, as well as newly introduced kernels such as the Beta, Kummer, and Tricomi kernels. In particular, we show that the spectral distributions of these kernels are scale mixtures of the multivariate Gaussian distribution. This provides a very simple way to adapt existing random Fourier features software based on Gaussian kernels to any positive definite shift-invariant kernel. This result has broad applications for support vector machines, kernel ridge regression, Gaussian processes, and other kernel-based machine learning techniques for which the random Fourier features technique is applicable.
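To make the "scale mixture of Gaussians" idea concrete, here is a minimal sketch in Python (NumPy only). It does not reproduce the paper's general mixing-distribution formula, which the abstract does not state. It relies instead on a classical special case: the spectral distribution of the standard Matérn kernel with smoothness $\nu$ is a multivariate Student-t with $2\nu$ degrees of freedom, that is, a Gaussian vector divided by the square root of a scaled chi-square variable. The function names `sample_matern_frequencies` and `rff_features`, the parameter names, and the feature count are illustrative assumptions. With $\nu = 1/2$ the target is the exponential kernel $\exp(-\|x - y\| / \ell)$, which gives an exact reference for comparison.

```python
import numpy as np


def sample_matern_frequencies(n_features, dim, nu=0.5, lengthscale=1.0, rng=None):
    """Draw frequencies for a Matern kernel as a Gaussian scale mixture.

    Assumption (classical result, not the paper's general formula): the
    spectral distribution is multivariate Student-t with 2*nu degrees of
    freedom, i.e. omega = z * sqrt(2*nu / u) / lengthscale with
    z ~ N(0, I_dim) and u ~ chi-square(2*nu).
    """
    rng = np.random.default_rng(rng)
    z = rng.standard_normal((n_features, dim))          # Gaussian RFF draw
    u = rng.chisquare(2.0 * nu, size=(n_features, 1))   # mixing variable
    scale = np.sqrt(2.0 * nu / u)                       # one scalar per frequency
    return scale * z / lengthscale


def rff_features(X, omega, phase):
    """Standard random Fourier feature map: sqrt(2/D) * cos(X omega^T + b)."""
    n_features = omega.shape[0]
    return np.sqrt(2.0 / n_features) * np.cos(X @ omega.T + phase)


rng = np.random.default_rng(0)
dim, n_features = 3, 20000

# The only change relative to Gaussian RFF is the per-frequency rescaling.
omega = sample_matern_frequencies(n_features, dim, nu=0.5, rng=rng)
phase = rng.uniform(0.0, 2.0 * np.pi, size=n_features)

X = rng.standard_normal((5, dim))
Z = rff_features(X, omega, phase)
K_approx = Z @ Z.T

# Exact exponential kernel exp(-||x - y||) (Matern with nu = 1/2, lengthscale 1).
dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
K_exact = np.exp(-dists)

max_err = np.max(np.abs(K_approx - K_exact))
print("max abs error:", max_err)
```

The design point is that the Gaussian draw `z` and the feature map are untouched: only a scalar mixing variable per frequency is added, which is what should make adapting Gaussian-kernel RFF code straightforward. Other kernel families would replace the chi-square draw with their own mixing distribution.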