🤖 AI Summary
Bayesian optimization (BO) over combinatorial domains, in tasks ranging from materials science to neural architecture search, requires specialized kernels, yet the relationships among existing combinatorial kernels are not well understood. To address this, the authors propose a unifying framework based on heat kernels, which they derive systematically and express as simple closed-form expressions applicable to diverse combinatorial structures, including sequences, trees, and graphs, and which embed directly into standard BO pipelines. The theoretical analysis shows that many successful combinatorial kernels are either related or equivalent to heat kernels. It also shows that, unlike certain algorithms whose performance drops substantially when the unknown optima lack a particular structure, heat kernels are not sensitive to the location of the optima. In experiments, a fast and simple pipeline built on heat kernels achieves state-of-the-art results, matching or even outperforming certain slow or complex algorithms.
📝 Abstract
Bayesian Optimization (BO) has the potential to solve various combinatorial tasks, ranging from materials science to neural architecture search. However, BO requires specialized kernels to effectively model combinatorial domains. Recent efforts have introduced several combinatorial kernels, but the relationships among them are not well understood. To bridge this gap, we develop a unifying framework based on heat kernels, which we derive in a systematic way and express as simple closed-form expressions. Using this framework, we prove that many successful combinatorial kernels are either related or equivalent to heat kernels, and validate this theoretical claim in our experiments. Moreover, our analysis confirms and extends the results presented in Bounce: certain algorithms' performance decreases substantially when the unknown optima of the function do not have a certain structure. In contrast, heat kernels are not sensitive to the location of the optima. Lastly, we show that a fast and simple pipeline, relying on heat kernels, is able to achieve state-of-the-art results, matching or even outperforming certain slow or complex algorithms.
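To make "simple closed-form expressions" concrete, here is a minimal sketch for one textbook case, which is not necessarily the parameterization used in the paper. It assumes binary inputs in {0,1}^n, viewed as vertices of the hypercube graph, and a single diffusion time `t` acting as a lengthscale-like hyperparameter. On that graph the normalized heat kernel exp(-tL) reduces to tanh(t) raised to the Hamming distance. The function names are hypothetical, and the brute-force routine exists only to check the closed form on small n.

```python
import itertools
import numpy as np

def hamming_heat_kernel(X, Y, t):
    """Normalized hypercube heat kernel: tanh(t) ** Hamming distance."""
    X = np.asarray(X)
    Y = np.asarray(Y)
    # Pairwise Hamming distances between rows of X and rows of Y.
    d = (X[:, None, :] != Y[None, :, :]).sum(axis=-1)
    return np.tanh(t) ** d

def brute_force_heat_kernel(n, t):
    """exp(-t L) for the hypercube graph Laplacian L, normalized to unit diagonal."""
    V = np.array(list(itertools.product([0, 1], repeat=n)))
    d = (V[:, None, :] != V[None, :, :]).sum(axis=-1)
    A = (d == 1).astype(float)           # edges join vertices differing in one bit
    L = np.diag(A.sum(axis=1)) - A       # combinatorial graph Laplacian
    w, U = np.linalg.eigh(L)             # L is symmetric, so eigh applies
    K = (U * np.exp(-t * w)) @ U.T       # matrix exponential via eigendecomposition
    return V, K / K[0, 0]                # the diagonal is constant by symmetry

V, K_exact = brute_force_heat_kernel(n=3, t=0.7)
K_closed = hamming_heat_kernel(V, V, t=0.7)
print(np.allclose(K_closed, K_exact))
```

The closed form costs one Hamming-distance computation per pair of inputs, whereas the brute-force version needs an eigendecomposition of a 2^n by 2^n matrix, which is the practical reason closed-form heat kernels matter in a BO loop.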