🤖 AI Summary
To address the slow convergence and high computational cost of Bayesian optimization (BO) in high-dimensional expensive black-box optimization, this paper proposes NeST-BO. Methodologically, NeST-BO introduces a novel acquisition function based on a one-step lookahead bound on the Newton-step error, enabling local quadratic convergence. It employs a Gaussian process that jointly models function values, gradients, and Hessians, explicitly incorporating curvature information. To mitigate the curse of dimensionality, NeST-BO performs gradient and Hessian inference and sampling within a low-dimensional subspace (e.g., a random embedding or a learned sparse subspace), reducing the complexity of learning curvature from $O(d^2)$ to $O(m^2)$, where $m \ll d$. Empirically, NeST-BO achieves significant improvements over state-of-the-art high-dimensional and local BO methods on synthetic and real-world benchmarks, including problems with thousands of variables, demonstrating faster convergence and lower regret.
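To make the subspace Newton step concrete, here is a minimal illustrative sketch. It replaces the paper's GP posterior means for the gradient and Hessian with central finite differences (an assumption for self-containedness), restricts them to a random $m$-dimensional subspace, and takes one Newton step there; all names and sizes are hypothetical, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 100, 5  # ambient and subspace dimensions (illustrative choices)

# Convex quadratic test objective with its minimizer at the origin.
diag = np.linspace(1.0, 10.0, d)
def f(x):
    return 0.5 * x @ (diag * x)

# Random subspace basis A: R^m -> R^d with orthonormal columns.
A, _ = np.linalg.qr(rng.standard_normal((d, m)))

def subspace_newton_step(x, h=1e-4):
    """Estimate the gradient and Hessian of f restricted to span(A) by
    central finite differences (a stand-in for the GP posterior means
    NeST-BO would infer), then return x plus the resulting Newton step."""
    g = np.zeros(m)
    H = np.zeros((m, m))
    for i in range(m):
        ei = A[:, i]
        g[i] = (f(x + h * ei) - f(x - h * ei)) / (2 * h)
        for j in range(i, m):
            ej = A[:, j]
            H[i, j] = H[j, i] = (
                f(x + h * ei + h * ej) - f(x + h * ei - h * ej)
                - f(x - h * ei + h * ej) + f(x - h * ei - h * ej)
            ) / (4 * h * h)
    # Inexact Newton step solved in the m-dim subspace, mapped back to R^d.
    s = np.linalg.solve(H, -g)
    return x + A @ s

x0 = rng.standard_normal(d)
x1 = subspace_newton_step(x0)
print(f(x1) < f(x0))  # the step strictly decreases the quadratic objective
```

Note the cost: only $O(m^2)$ objective evaluations per step are needed to estimate curvature in the subspace, mirroring the $O(d^2) \to O(m^2)$ reduction the method claims.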
📝 Abstract
Bayesian optimization (BO) is effective for expensive black-box problems but remains challenging in high dimensions. We propose NeST-BO, a local BO method that targets the Newton step by jointly learning gradient and Hessian information with Gaussian process surrogates, and selecting evaluations via a one-step lookahead bound on Newton-step error. We show that this bound (and hence the step error) contracts with batch size, so NeST-BO directly inherits inexact-Newton convergence: global progress under mild stability assumptions and quadratic local rates once steps are sufficiently accurate. To scale, we optimize the acquisition in low-dimensional subspaces (e.g., random embeddings or learned sparse subspaces), reducing the dominant cost of learning curvature from $O(d^2)$ to $O(m^2)$ with $m \ll d$ while preserving step targeting. Across high-dimensional synthetic and real-world problems, including cases with thousands of variables and unknown active subspaces, NeST-BO consistently yields faster convergence and lower regret than state-of-the-art local and high-dimensional BO baselines.
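The claimed $O(d^2) \to O(m^2)$ reduction can be made tangible by counting the distinct entries of the symmetric Hessian a surrogate must learn in the full space versus an $m$-dimensional subspace (the sizes below are illustrative assumptions, not the paper's benchmarks):

```python
# Distinct entries of a symmetric Hessian: n * (n + 1) / 2 for an n x n matrix.
d, m = 1000, 10  # illustrative ambient and subspace dimensions
full_entries = d * (d + 1) // 2  # full-space curvature: 500500 entries
sub_entries = m * (m + 1) // 2   # subspace curvature: 55 entries
print(full_entries, sub_entries)  # 500500 55
```

At $d = 1000$ and $m = 10$, the subspace surrogate learns roughly four orders of magnitude fewer curvature parameters.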