No-Regret Gaussian Process Optimization of Time-Varying Functions

📅 2025-11-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
We address the no-regret optimization of time-varying black-box functions under pure bandit feedback. We first show that standard GP-bandit algorithms fail to achieve no-regret in dynamic environments. To overcome this, we propose W-SparQ-GP-UCB—the first algorithm to incorporate uncertainty injection into time-varying Gaussian process (GP) optimization. It jointly employs heteroscedastic GP modeling, sparse inference, and RKHS norm constraints to explicitly characterize function non-stationarity. By re-querying historical points and adaptively updating the model, it attains no-regret with only asymptotically minimal additional queries. Theoretically, we establish the first lower bound on the minimum extra query overhead required for no-regret, quantify the fundamental trade-off between temporal variation rate and achievable regret rate, and provide matching upper and lower bounds—thereby fully characterizing the statistical limits of no-regret learning for time-varying GPs under bandit feedback.

📝 Abstract
Sequential optimization of black-box functions from noisy evaluations has been widely studied, with Gaussian Process bandit algorithms such as GP-UCB guaranteeing no-regret in stationary settings. However, for time-varying objectives, it is known that no-regret is unattainable under pure bandit feedback unless strong and often unrealistic assumptions are imposed. In this article, we propose a novel method to optimize time-varying rewards in the frequentist setting, where the objective has bounded RKHS norm. Time variations are captured through uncertainty injection (UI), which enables heteroscedastic GP regression that adapts past observations to the current time step. As no-regret is unattainable in general in the strict bandit setting, we relax the latter by allowing additional queries on previously observed points. Building on sparse inference and the effect of UI on regret, we propose W-SparQ-GP-UCB, an online algorithm that achieves no-regret with only a vanishing number of additional queries per iteration. To assess the theoretical limits of this approach, we establish a lower bound on the number of additional queries required for no-regret, proving the efficiency of our method. Finally, we provide a comprehensive analysis linking the degree of time-variation of the function to achievable regret rates, together with upper and lower bounds on the number of additional queries needed in each regime.
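A minimal sketch of the uncertainty-injection idea described in the abstract: heteroscedastic GP regression in which each past observation's noise variance is inflated in proportion to its age, so stale samples are down-weighted rather than discarded, combined with a standard UCB acquisition rule. This is an illustration of the UI mechanism only, not the authors' W-SparQ-GP-UCB (which additionally uses sparse inference and re-queries of historical points); the drifting objective `f`, the kernel length scale, and constants such as `beta` and `drift_rate` are all assumptions chosen for the example.

```python
import numpy as np

def rbf(a, b, ls=0.2):
    # Squared-exponential kernel matrix between 1-D point sets a and b.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_posterior(X, y, noise_vars, Xstar, ls=0.2):
    # Heteroscedastic GP regression: observation i carries its own noise
    # variance noise_vars[i] (this is where uncertainty injection enters).
    K = rbf(X, X, ls) + np.diag(noise_vars)
    Ks = rbf(Xstar, X, ls)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = 1.0 - np.sum(v * v, axis=0)  # RBF prior variance is 1
    return mu, np.maximum(var, 1e-12)

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 200)

def f(x, t):
    # Hypothetical time-varying objective: a peak that drifts slowly with t.
    return np.exp(-((x - (0.3 + 0.002 * t)) / 0.1) ** 2)

X, y, ages = [], [], []
beta, base_noise, drift_rate = 2.0, 1e-3, 1e-2
for t in range(30):
    if X:
        Xa, ya = np.array(X), np.array(y)
        # Uncertainty injection: noise variance grows linearly with sample age,
        # so old observations inform the posterior less at the current step.
        nv = base_noise + drift_rate * (t - np.array(ages))
        mu, var = gp_posterior(Xa, ya, nv, grid)
        x_next = grid[np.argmax(mu + beta * np.sqrt(var))]  # UCB rule
    else:
        x_next = rng.uniform()
    X.append(x_next)
    y.append(f(x_next, t) + 1e-2 * rng.standard_normal())
    ages.append(t)
```

The only change relative to standard GP-UCB is the per-observation noise vector `nv`; with homoscedastic noise the posterior would keep trusting stale samples and the UCB choice would lag behind the moving optimum.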
Problem

Research questions and friction points this paper is trying to address.

Optimizes time-varying black-box functions with Gaussian processes
Achieves no-regret via uncertainty injection and additional queries
Establishes bounds linking time-variation to regret and query efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses uncertainty injection for heteroscedastic GP regression
Relaxes bandit setting with vanishing additional queries
Proposes W-SparQ-GP-UCB algorithm for no-regret optimization
Eliabelle Mauduit
Unité de Mathématiques Appliquées, ENSTA, Institut Polytechnique de Paris, 91120 Palaiseau, France
Eloïse Berthier
U2IS, ENSTA, Institut Polytechnique de Paris, 91120 Palaiseau, France
Andrea Simonetto
ENSTA
Convex Optimization, Control, Cyber-physical Systems, Signal Processing, Quantum Computing