🤖 AI Summary
This work addresses the lack of high-probability regret bounds for Gaussian process Thompson sampling (GP-TS) in Bayesian optimization, as well as the unclear dependence of its regret on the failure probability δ and the time horizon T. Under the assumption that the objective function is a sample path from a Gaussian process, the paper establishes the first regret lower bound for GP-TS, showing that with probability δ its regret depends polynomially on 1/δ, and derives an improved upper bound on cumulative regret. By relaxing conditions used in existing GP-UCB analyses and introducing refined probabilistic arguments alongside auxiliary lemmas, the authors provide several rigorous theoretical guarantees, including a bound on the second moment of cumulative regret and expected lenient regret bounds. The results show that GP-TS enjoys better dependence on both δ and T than prior analyses suggested, thereby filling a critical theoretical gap and offering new foundations for Bayesian optimization.
📝 Abstract
We study a widely used Bayesian optimization method, Gaussian process Thompson sampling (GP-TS), under the assumption that the objective function is a sample path from a GP. In contrast to the GP upper confidence bound (GP-UCB) algorithm, for which both high-probability and expected regret bounds are established, most analyses of GP-TS have been limited to expected regret. Moreover, it remains unclear whether the recent GP-UCB analyses of lenient regret and of the improved cumulative regret upper bound carry over to GP-TS. To fill these gaps, this paper establishes several regret bounds: (i) a regret lower bound for GP-TS, which implies that GP-TS suffers a polynomial dependence on $1/δ$ with probability $δ$; (ii) an upper bound on the second moment of cumulative regret, which directly yields a regret upper bound with improved dependence on $δ$; (iii) expected lenient regret upper bounds; and (iv) a cumulative regret upper bound with improved dependence on the time horizon $T$. Along the way, we provide several useful lemmas, including a relaxation of the condition required by recent analyses to obtain improved regret upper bounds on $T$.
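To make the object of study concrete, here is a minimal sketch of GP-TS on a discrete candidate set: at each round the algorithm draws one sample from the current GP posterior and queries its maximizer. The RBF kernel, noise level, grid, and toy objective below are illustrative assumptions, not the paper's setup or bounds.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel k(x, x') = exp(-(x - x')^2 / (2 l^2))."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_obs, y_obs, x_cand, noise_var=1e-2):
    """GP posterior mean and covariance at the candidate points."""
    K = rbf_kernel(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    K_star = rbf_kernel(x_cand, x_obs)
    K_ss = rbf_kernel(x_cand, x_cand)
    mean = K_star @ np.linalg.solve(K, y_obs)
    cov = K_ss - K_star @ np.linalg.solve(K, K_star.T)
    return mean, cov

def gp_ts(objective, x_cand, T=30, noise_var=1e-2, rng=None):
    """Run GP-TS for T rounds; return queried points and noisy observations."""
    rng = np.random.default_rng(rng)
    # Initialize with one uniformly random query.
    xs = [rng.choice(x_cand)]
    ys = [objective(xs[0]) + np.sqrt(noise_var) * rng.standard_normal()]
    for _ in range(T - 1):
        mean, cov = gp_posterior(np.array(xs), np.array(ys), x_cand, noise_var)
        # Thompson step: draw one posterior sample path and maximize it.
        jitter = 1e-8 * np.eye(len(x_cand))  # numerical stabilizer
        f_sample = rng.multivariate_normal(mean, cov + jitter)
        x_next = x_cand[np.argmax(f_sample)]
        xs.append(x_next)
        ys.append(objective(x_next) + np.sqrt(noise_var) * rng.standard_normal())
    return np.array(xs), np.array(ys)

# Toy usage: maximize a smooth 1-D function on a grid of 101 candidates.
x_cand = np.linspace(0.0, 1.0, 101)
f = lambda x: np.sin(3 * np.pi * x) * np.exp(-x)
xs, ys = gp_ts(f, x_cand, T=30, rng=0)
print("best observed value:", ys.max())
```

The cumulative regret analyzed in the paper would be $\sum_t (f(x^*) - f(x_t))$ over the queried points $x_t$; the bounds concern how this quantity scales with $T$ and with the failure probability $δ$.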