Relative Information Gain and Gaussian Process Regression

📅 2025-10-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper investigates the sample complexity of function estimation and optimisation in Gaussian process regression, focusing on the relationship between information gain and effective dimension, and on the sensitivity of the information gain to observation noise. The authors introduce *relative information gain*, a new quantity that measures how the information gain responds to perturbations of the noise level and that smoothly interpolates between the information gain and the effective dimension. This quantity arises naturally from the complexity term in a new PAC-Bayesian excess risk bound for Gaussian process regression. Using tools from reproducing kernel Hilbert space theory and spectral analysis, the authors derive upper bounds on the relative information gain that depend explicitly on the kernel's spectral decay; combined with the excess risk bound, these yield minimax-optimal rates of convergence.

📝 Abstract
The sample complexity of estimating or maximising an unknown function in a reproducing kernel Hilbert space is known to be linked to both the effective dimension and the information gain associated with the kernel. While the information gain has an attractive information-theoretic interpretation, the effective dimension typically results in better rates. We introduce a new quantity called the relative information gain, which measures the sensitivity of the information gain with respect to the observation noise. We show that the relative information gain smoothly interpolates between the effective dimension and the information gain, and that the relative information gain has the same growth rate as the effective dimension. In the second half of the paper, we prove a new PAC-Bayesian excess risk bound for Gaussian process regression. The relative information gain arises naturally from the complexity term in this PAC-Bayesian bound. We prove bounds on the relative information gain that depend on the spectral properties of the kernel. When these upper bounds are combined with our excess risk bound, we obtain minimax-optimal rates of convergence.
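The two complexity measures named in the abstract have standard closed forms for Gaussian process regression with noise variance σ²: the information gain γ = ½ log det(I + σ⁻²K) and the effective dimension d_eff = tr(K(K + σ²I)⁻¹). A minimal numerical sketch below computes both for an RBF kernel and checks the identity dγ/d log(σ⁻²) = d_eff/2, which holds for these standard definitions and illustrates why a noise-sensitivity quantity can sit between the two. Note this is only an illustration of the standard quantities; the paper's exact definition of relative information gain is not reproduced here.

```python
# Sketch using the standard definitions of information gain and effective
# dimension for GP regression. The noise-derivative identity checked at the
# end is a property of these standard definitions, not a claim about the
# paper's exact definition of relative information gain.
import numpy as np

def rbf_kernel(X, lengthscale=1.0):
    # K[i, j] = exp(-||x_i - x_j||^2 / (2 * lengthscale^2))
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2 * lengthscale**2))

def information_gain(K, sigma2):
    # gamma = 0.5 * log det(I + K / sigma2)
    n = K.shape[0]
    _, logdet = np.linalg.slogdet(np.eye(n) + K / sigma2)
    return 0.5 * logdet

def effective_dimension(K, sigma2):
    # d_eff = tr(K (K + sigma2 * I)^{-1})
    n = K.shape[0]
    return np.trace(K @ np.linalg.inv(K + sigma2 * np.eye(n)))

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 2))
K = rbf_kernel(X)
sigma2 = 0.1

gamma = information_gain(K, sigma2)
d_eff = effective_dimension(K, sigma2)

# For these definitions, d gamma / d log(sigma^{-2}) = d_eff / 2: the
# noise-derivative of the information gain recovers the effective dimension.
# Central finite difference in log(sigma^{-2}):
eps = 1e-5
g_plus = information_gain(K, sigma2 * np.exp(-eps))   # log(1/sigma2) + eps
g_minus = information_gain(K, sigma2 * np.exp(eps))   # log(1/sigma2) - eps
deriv = (g_plus - g_minus) / (2 * eps)
print(gamma, d_eff, deriv)  # deriv matches d_eff / 2
```

The check makes the interpolation claim concrete: differentiating the information gain with respect to the (log) inverse noise turns one complexity measure into the other, so a quantity built from this sensitivity can inherit the growth rate of the effective dimension while retaining the information-theoretic form of the information gain.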
Problem

Research questions and friction points this paper is trying to address.

Information gain has an attractive information-theoretic interpretation, but the effective dimension typically yields better rates
Existing complexity measures do not capture sensitivity of the information gain to observation noise
No single quantity connects the growth behaviour of the two measures in sample complexity bounds
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces relative information gain, measuring the noise-sensitivity of the information gain
Proves a new PAC-Bayesian excess risk bound for Gaussian process regression
Achieves minimax-optimal convergence rates via spectral bounds on the relative information gain