Sharper Guarantees for Misspecified Kernelized Bandit Optimization

📅 2026-05-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing theory for kernelized bandit optimization is overly pessimistic under model misspecification because the misspecification level ε is amplified by kernel complexity: by √d_eff in generic offline bounds, where d_eff is the kernel effective dimension, and by √γₙ in online regret bounds, where γₙ is the maximum information gain after n rounds. This work reduces that amplification through localization. In the offline setting, a spectral Lebesgue constant governs the misspecification term of the simple-regret bound; in the online setting, a modified domain-splitting algorithm keeps local misspecification errors from propagating globally. For a large class of kernels, the amplification drops from a square-root dependence on kernel complexity to logarithmic or polylogarithmic growth. Specifically, the offline bounds give logarithmic amplification for one-dimensional monotone spectra and polylogarithmic amplification for multivariate Fourier-diagonal product kernels, and the online cumulative regret bound is Õ(√(γₙn) + nε) under mild localized eigendecay assumptions, which removes the extra √γₙ factor from the misspecification term.

📝 Abstract

Existing guarantees for misspecified kernelized bandit optimization pay for misspecification through kernel complexity: in generic offline bounds, the misspecification level $\varepsilon$ is multiplied by $\sqrt{d_\mathrm{eff}}$, where $d_\mathrm{eff}$ is the kernel effective dimension, while in online regret bounds, the corresponding penalty is $\sqrt{\gamma_n}\,n\varepsilon$, where $\gamma_n$ is the maximum information gain after $n$ rounds of interaction. In this work, we show that, for a large class of kernels, the misspecification amplification can be reduced to logarithmic or polylogarithmic growth. In the offline setting, we first prove high-probability simple-regret bounds whose misspecification term is governed by a spectral Lebesgue constant. This yields logarithmic amplification for one-dimensional monotone spectra and polylogarithmic amplification for multivariate Fourier-diagonal product kernels. In the online setting, we modify a domain-splitting algorithm and prove a cumulative regret bound of $\widetilde{\mathcal O}(\sqrt{\gamma_n n}+n\varepsilon)$ under mild localized eigendecay assumptions, removing the extra $\sqrt{\gamma_n}$ factor from the misspecification term. The common principle is localization: spectral localization controls the Lebesgue constant of the offline approximation operator, while domain splitting implements the spatial analogue of this mechanism in the online setting, preventing local misspecification errors from being amplified globally.
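
To give a feel for the quantities the abstract refers to, the following is a minimal numerical sketch, not the paper's method. It assumes the common textbook definitions $d_\mathrm{eff}(\lambda)=\operatorname{tr}\!\big(K(K+\lambda I)^{-1}\big)$ and information gain $\tfrac12\log\det(I+K/\lambda)$ for a Gram matrix $K$; the paper's exact definitions may differ. The RBF kernel, the uniform grid, the lengthscale, $\lambda$, and all function names are illustrative assumptions. The information gain of one fixed design only lower-bounds the maximum information gain $\gamma_n$, and constants and logarithmic factors in the regret terms are ignored.

```python
import numpy as np


def rbf_gram(x, lengthscale=0.2):
    """Gram matrix of a 1-D RBF kernel (illustrative kernel choice)."""
    diff = x[:, None] - x[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)


def effective_dimension(K, lam):
    """d_eff(lam) = sum_i mu_i / (mu_i + lam), assuming the standard definition."""
    mu = np.clip(np.linalg.eigvalsh(K), 0.0, None)  # clip round-off negatives
    return float(np.sum(mu / (mu + lam)))


def information_gain(K, lam):
    """0.5 * log det(I + K / lam) for this fixed design.

    This only lower-bounds the maximum information gain gamma_n,
    which is a maximum over all designs of the same size.
    """
    mu = np.clip(np.linalg.eigvalsh(K), 0.0, None)
    return float(0.5 * np.sum(np.log1p(mu / lam)))


def misspecification_terms(n, eps, gamma):
    """Misspecification terms with constants and log factors dropped."""
    amplified = np.sqrt(gamma) * n * eps  # sqrt(gamma_n) * n * eps
    localized = n * eps                   # n * eps
    return amplified, localized


# Hypothetical problem sizes chosen only for illustration.
n, eps, lam = 200, 0.01, 1.0
x = np.linspace(0.0, 1.0, n)
K = rbf_gram(x)

d_eff = effective_dimension(K, lam)
gain = information_gain(K, lam)
amplified, localized = misspecification_terms(n, eps, gain)

print(f"effective dimension: {d_eff:.2f}")
print(f"information gain of this design: {gain:.2f}")
print(f"sqrt(gamma) * n * eps: {amplified:.2f}")
print(f"n * eps: {localized:.2f}")
```

The ratio of the two printed misspecification terms is exactly the square root of the information-gain proxy, which is the amplification factor the abstract says is removed in the online setting.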

Problem

Research questions and friction points this paper is trying to address.

misspecified kernelized bandit

model misspecification

regret amplification

kernel complexity

effective dimension

Innovation

Methods, ideas, or system contributions that make the work stand out.

misspecification

localization

kernelized bandits