🤖 AI Summary
This work addresses the scalability limitations of batch Bayesian optimization (BO) in high-dimensional, combinatorial, and discontinuous design spaces. We propose a generative batch optimization framework that directly employs a generative model as the acquisition function: a proposal distribution, with density proportional to the acquisition function, is constructed from utility values computed on observed data, bypassing explicit surrogate modeling. Our key contribution is the first unified integration of generative models for simultaneous density estimation, sampling, and acquisition function evaluation, enabling training with diverse reward signals and noisy utility labels. We provide theoretical guarantees showing asymptotic convergence of the generated sequence to the global optimum. Experiments demonstrate substantial improvements in efficiency and scalability on large-scale batch optimization tasks, while maintaining competitive approximation of the global optimum.
📝 Abstract
We present a general strategy for turning generative models into candidate solution samplers for batch Bayesian optimization (BO). Using generative models for BO enables large-batch scaling, since proposing candidates reduces to generative sampling, as well as optimization over non-continuous, high-dimensional, and combinatorial design spaces. Inspired by the success of direct preference optimization (DPO), we show that one can train a generative model directly on noisy, simple utility values computed from observations, so that the resulting proposal distributions have densities proportional to the expected utility, i.e., BO's acquisition function values. Furthermore, this approach generalizes beyond preference-based feedback to broader classes of reward signals and loss functions. This perspective avoids the construction of surrogate (regression or classification) models, which is common in previous methods that use generative models for black-box optimization. Theoretically, we show that the generative models within the BO process approximately follow a sequence of distributions that, under certain conditions, asymptotically concentrates on the global optima. We also demonstrate this effect through experiments on challenging optimization problems involving large batches in high dimensions.
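The core loop described above can be illustrated with a minimal toy sketch. Here a utility-weighted Gaussian mixture centered on observed points stands in for the trained generative model: its sampling density is proportional to the (softmax) utility of the observations, so drawing from it plays the role of sampling from the acquisition-proportional proposal. The objective, utility choice, and all names are illustrative assumptions, not the paper's actual method or code.

```python
import numpy as np

rng = np.random.default_rng(0)

def objective(x):
    # Toy black-box objective (an assumption for illustration):
    # maximized at x = 0.7 in every coordinate, optimum value 0.
    return -np.sum((x - 0.7) ** 2, axis=-1)

def utility(y, temperature=0.1):
    # Simple noisy-utility stand-in: softmax weights over observed values,
    # so better observations get proportionally more sampling mass.
    w = np.exp((y - y.max()) / temperature)
    return w / w.sum()

def sample_batch(X_obs, weights, batch_size, sigma=0.05):
    # Generative-model stand-in: a utility-weighted Gaussian mixture
    # over observed points. Sampling from it approximates drawing from
    # a proposal whose density is proportional to the utility.
    idx = rng.choice(len(X_obs), size=batch_size, p=weights)
    noise = sigma * rng.normal(size=(batch_size, X_obs.shape[1]))
    return np.clip(X_obs[idx] + noise, 0.0, 1.0)

# Batch optimization loop: sample a batch, evaluate, fold back into the data.
dim, batch = 5, 32
X = rng.uniform(size=(batch, dim))
y = objective(X)
for _ in range(20):
    X_new = sample_batch(X, utility(y), batch)
    X = np.vstack([X, X_new])
    y = np.concatenate([y, objective(X_new)])

best = X[y.argmax()]
```

Because each batch is resampled from a distribution concentrated on high-utility regions, the sequence of proposals drifts toward the optimum, mirroring (in a crude, surrogate-free way) the concentration behavior the theory describes for the actual generative models.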