🤖 AI Summary
This paper addresses two key bottlenecks in the initialization phase of Bayesian optimization: (1) conventional space-filling designs are often ineffective at reducing the surrogate model's predictive uncertainty, and (2) they can conflict with the objective of learning good hyperparameters. To resolve these issues, the authors propose HIPE (Hyperparameter-Informed Predictive Exploration), an active sampling strategy that brings information-theoretic principles to initialization design. HIPE unifies hyperparameter learning and uncertainty reduction within a Gaussian process framework and admits a closed-form acquisition function. Extensive experiments across multiple benchmark tasks show that HIPE improves predictive accuracy, hyperparameter estimation quality, and downstream optimization efficiency over Sobol sequences, Latin Hypercube Sampling (LHS), and entropy-based initialization approaches, particularly in the large-batch, few-shot settings common in practical deployments.
📝 Abstract
Bayesian Optimization is a widely used method for optimizing expensive black-box functions, relying on probabilistic surrogate models such as Gaussian Processes. The quality of the surrogate model is crucial for good optimization performance, especially in the few-shot setting where only a small number of batches of points can be evaluated. In this setting, the initialization plays a critical role in shaping the surrogate's predictive quality and guiding subsequent optimization. Despite this, practitioners typically rely on (quasi-)random designs to cover the input space. However, such approaches neglect two key factors: (a) space-filling designs may not be well suited to reducing predictive uncertainty, and (b) efficient hyperparameter learning during initialization, which is essential for high-quality prediction, may conflict with space-filling designs. To address these limitations, we propose Hyperparameter-Informed Predictive Exploration (HIPE), a novel acquisition strategy that balances predictive uncertainty reduction with hyperparameter learning using information-theoretic principles. We derive a closed-form expression for HIPE in the Gaussian Process setting and demonstrate its effectiveness through extensive experiments in active learning and few-shot BO. Our results show that HIPE outperforms standard initialization strategies in terms of predictive accuracy, hyperparameter identification, and subsequent optimization performance, particularly in large-batch, few-shot settings relevant to many real-world Bayesian Optimization applications.
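To make the abstract's core idea concrete, here is a minimal sketch of an acquisition that trades off the two quantities HIPE is said to balance: the GP posterior variance at a candidate (predictive-uncertainty reduction) and the marginal gain in Fisher information about a kernel hyperparameter (hyperparameter learning). This is an illustrative stand-in, not the paper's actual closed-form HIPE criterion; the function names (`hipe_like_score`), the RBF kernel, the fixed lengthscale, and the `beta` trade-off weight are all assumptions made for the sketch.

```python
import numpy as np


def rbf_kernel(X1, X2, lengthscale=0.3, variance=1.0):
    """Squared-exponential (RBF) kernel matrix between two point sets."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)


def predictive_variance(X_train, X_cand, lengthscale=0.3, noise=1e-4):
    """GP posterior variance at candidate points (zero-mean prior)."""
    K = rbf_kernel(X_train, X_train, lengthscale) + noise * np.eye(len(X_train))
    k = rbf_kernel(X_train, X_cand, lengthscale)          # shape (n, m)
    K_inv = np.linalg.inv(K)
    k_xx = np.diag(rbf_kernel(X_cand, X_cand, lengthscale))
    # k_xx[j] - k[:, j]^T K^{-1} k[:, j] for every candidate j
    return k_xx - np.einsum("ij,ik,kj->j", k, K_inv, k)


def lengthscale_fisher_info(X, lengthscale=0.3, noise=1e-4):
    """Fisher information of the lengthscale for a zero-mean GP:
    I(l) = 0.5 * tr((K^{-1} dK/dl)^2) -- a standard proxy for how much a
    design can teach us about that hyperparameter."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K_f = np.exp(-0.5 * d2 / lengthscale ** 2)            # noise-free kernel
    dK = K_f * d2 / lengthscale ** 3                      # dK/dl, elementwise
    K_inv = np.linalg.inv(K_f + noise * np.eye(len(X)))
    A = K_inv @ dK
    return 0.5 * np.trace(A @ A)


def hipe_like_score(X_train, X_cand, beta=1.0, lengthscale=0.3):
    """Hypothetical HIPE-style acquisition: posterior variance plus the
    marginal Fisher-information gain from adding each candidate."""
    base = lengthscale_fisher_info(X_train, lengthscale)
    var = predictive_variance(X_train, X_cand, lengthscale)
    gain = np.array([
        lengthscale_fisher_info(np.vstack([X_train, x[None]]), lengthscale) - base
        for x in X_cand
    ])
    return var + beta * gain


# Example: candidates far from observed points get high variance scores,
# while the information term rewards points that pin down the lengthscale.
X_train = np.array([[0.0], [0.5], [1.0]])
X_cand = np.array([[0.0], [0.25]])
scores = hipe_like_score(X_train, X_cand)
```

A pure space-filling design would score candidates only by coverage; the `beta`-weighted information term is what lets this kind of criterion deviate from space-filling when doing so helps hyperparameter identification, which is the trade-off the abstract attributes to HIPE.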