Batched Kernelized Bandits: Refinements and Extensions

📅 2026-03-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work studies black-box optimization of a function with bounded norm in a reproducing kernel Hilbert space (RKHS) under noisy feedback revealed in batches, covering both the non-robust and adversarially robust settings. On the upper-bound side, it pins down the optimal number of batches including constant factors (to within $1+o(1)$) and removes the extraneous factor of $B$ from existing regret bounds. On the lower-bound side, it establishes algorithm-independent bounds for adaptively chosen batch sizes, showing that adaptive and fixed batching have essentially the same minimax regret scaling. In the robust setting, the proposed robust-BPE algorithm attains, for a suitably defined cumulative regret notion, the same bound as in the non-robust setting, along with a simple regret bound significantly below that of previous work.

📝 Abstract

In this paper, we consider the problem of black-box optimization with noisy feedback revealed in batches, where the unknown function to optimize has a bounded norm in some Reproducing Kernel Hilbert Space (RKHS). We refer to this as the Batched Kernelized Bandits problem, and refine and extend existing results on regret bounds. For algorithmic upper bounds, Li and Scarlett (2022) show that $B=O(\log\log T)$ batches suffice to attain near-optimal regret, where $T$ is the time horizon and $B$ is the number of batches. We further refine this by (i) finding the optimal number of batches including constant factors (to within $1+o(1)$), and (ii) removing a factor of $B$ in the regret bound. For algorithm-independent lower bounds, noticing that existing results only apply when the batch sizes are fixed in advance, we present novel lower bounds when the batch sizes are chosen adaptively, and show that adaptive batches have essentially the same minimax regret scaling as fixed batches. Furthermore, we consider a robust setting where the goal is to choose points for which the function value remains high even after an adversarial perturbation. We present the robust-BPE algorithm, show that a suitably defined cumulative regret notion incurs the same bound as in the non-robust setting, and derive a simple regret bound significantly below that of previous work.
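
To give a feel for why $O(\log\log T)$ batches can cover a horizon of length $T$, the sketch below builds one fixed batch schedule in which each batch has length $\lceil\sqrt{T \cdot N_{i-1}}\rceil$, starting from $N_0 = 1$. This recursion is an assumption chosen for illustration, as are the function name `batch_sizes` and the truncation of the final batch; the abstract does not state the schedule, and the refined schedule with the optimal constant is not reproduced here.

```python
import math


def batch_sizes(T):
    """Illustrative batch lengths N_i = ceil(sqrt(T * N_{i-1})), N_0 = 1.

    The last batch is truncated so that the lengths sum to exactly T.
    This recursion is an assumed, illustrative schedule, not the paper's.
    """
    sizes = []
    prev = 1   # N_0
    used = 0   # rounds allocated so far
    while used < T:
        n = math.ceil(math.sqrt(T * prev))
        n = min(n, T - used)  # do not run past the horizon
        sizes.append(n)
        used += n
        prev = n
    return sizes


if __name__ == "__main__":
    for T in (10**3, 10**6, 10**9):
        sizes = batch_sizes(T)
        print(T, len(sizes), sizes)
```

Under this recursion the $i$-th batch has length at least $T^{1-2^{-i}}$, so the number of batches needed to exhaust the horizon is at most $\lceil\log_2\log_2 T\rceil + 1$, which grows extremely slowly with $T$.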

Problem

Research questions and friction points this paper is trying to address.

Batched Kernelized Bandits

Black-box Optimization

Noisy Feedback

Reproducing Kernel Hilbert Space

Robust Optimization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Batched Kernelized Bandits

Regret Bounds

Adaptive Batching