🤖 AI Summary
In high-dimensional Bayesian optimization (HBO), simple methods often outperform sophisticated ones. A key reason is gradient vanishing in Gaussian process (GP) surrogates caused by poor length-scale initialization, which undermines global surrogate modeling; strategies that promote local search are better suited to the sparse response landscapes typical of high-dimensional problems.
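The gradient-vanishing claim has a simple geometric intuition: with a fixed length scale, the squared distance between random points grows linearly with dimension, so kernel values (and hence GP posterior gradients) collapse toward zero. A minimal NumPy sketch, where the unit length scale and uniform sampling are illustrative assumptions rather than the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(x, y, lengthscale=1.0):
    """Isotropic RBF (squared-exponential) kernel between two points."""
    return float(np.exp(-np.sum((x - y) ** 2) / (2.0 * lengthscale ** 2)))

# With the length scale held fixed, the expected squared distance between
# uniform points grows like d/6, so kernel values decay as exp(-d/12).
kvals = {}
for d in (2, 20, 200):
    x, y = rng.uniform(0.0, 1.0, size=(2, d))
    kvals[d] = rbf(x, y)
    print(f"d={d:3d}  k(x, y) = {kvals[d]:.2e}")
```

At d = 200 the kernel value is numerically negligible, so the GP posterior is essentially flat between observations; adapting the length scale to the dimension restores useful gradients.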
Method: We propose MSR (Maximum-Likelihood-based Scale-adaptive Refinement), a length-scale adaptation strategy grounded in maximum likelihood estimation that strengthens local exploration to mitigate gradient vanishing.
Contribution/Results: Through rigorous theoretical analysis and targeted ablation experiments, we validate MSR’s mechanism. Evaluated on multiple real-world black-box optimization benchmarks, MSR consistently achieves state-of-the-art (SOTA) performance, significantly surpassing existing HBO methods. Our work establishes a new paradigm for high-dimensional Bayesian optimization and delivers a principled, practical solution.
📝 Abstract
Recent work reported that simple Bayesian optimization methods perform well for high-dimensional real-world tasks, seemingly contradicting prior work and tribal knowledge. This paper investigates the 'why'. We identify fundamental challenges that arise in high-dimensional Bayesian optimization and explain why recent methods succeed. Our analysis shows that vanishing gradients caused by Gaussian process initialization schemes play a major role in the failures of high-dimensional Bayesian optimization and that methods that promote local search behaviors are better suited for the task. We find that maximum likelihood estimation of Gaussian process length scales suffices for state-of-the-art performance. Based on this, we propose a simple variant of maximum likelihood estimation called MSR that leverages these findings to achieve state-of-the-art performance on a comprehensive set of real-world applications. We also present targeted experiments to illustrate and confirm our findings.
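The abstract's central finding, that maximum likelihood estimation of GP length scales suffices, corresponds to the standard marginal-likelihood fit available in off-the-shelf GP libraries. A hedged sketch using scikit-learn's `GaussianProcessRegressor`; the toy objective, dimensionality, and bounds are illustrative assumptions, and the paper's MSR variant is not reproduced here:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

# Toy objective: only dimension 0 matters, mimicking the sparse
# high-dimensional landscapes the summary describes. (Illustrative
# assumption, not one of the paper's benchmarks.)
d, n = 3, 40
X = rng.uniform(0.0, 1.0, size=(n, d))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.standard_normal(n)

# Anisotropic RBF: one length scale per dimension. .fit() maximizes the
# log marginal likelihood over the length scales (type-II MLE).
# alpha=1e-2 is the assumed observation-noise variance (std 0.1).
kernel = RBF(length_scale=np.ones(d), length_scale_bounds=(1e-2, 1e3))
gp = GaussianProcessRegressor(kernel=kernel, alpha=1e-2,
                              normalize_y=True).fit(X, y)

fitted = gp.kernel_.length_scale
print("fitted length scales:", np.round(fitted, 3))
# Typically the relevant dimension 0 ends up with the shortest fitted
# length scale, while irrelevant dimensions are pushed toward long scales.
```

Fitting one length scale per dimension lets the marginal likelihood shrink scales along informative directions, which is the kind of local, data-driven refinement the paper argues is sufficient for strong HBO performance.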