🤖 AI Summary
High-dimensional, non-convex scientific optimization problems are often prohibitively expensive to evaluate experimentally, and human-guided searches are prone to confirmation bias and outdated domain knowledge.
Method: We propose an LLM-driven, context-aware Bayesian optimization (BO) framework that dynamically incorporates expert knowledge into the search process. It blends stochastic inference with domain-knowledge-based insights from the LLM and generates natural-language explanations, enabling the LLM to produce real-time, interpretable, and empirically verifiable search suggestions.
Contribution/Results: Evaluated on synthetic benchmarks with up to 15 independent variables and four real-world scientific experiments, the framework substantially improves convergence speed and solution quality over standard BO. It provides empirical evidence that large language models can contribute reliable, interpretable reasoning to complex scientific optimization, bridging the gap between black-box optimization and domain-informed, human-aligned decision-making.
📝 Abstract
Many important scientific problems involve multivariate optimization coupled with slow and laborious experimental measurements. These complex, high-dimensional searches can be defined by non-convex optimization landscapes that resemble needle-in-a-haystack surfaces, leading to entrapment in local minima. Contextualizing optimizers with human domain knowledge is a powerful approach to guide searches toward localized, fruitful regions. However, this approach is susceptible to human confirmation bias, and it is also challenging for domain experts to keep pace with the rapidly expanding scientific literature. Here, we propose using Large Language Models (LLMs) to contextualize Bayesian optimization (BO) via a hybrid optimization framework that intelligently and economically blends stochastic inference with domain-knowledge-based insights from the LLM, which suggests new, better-performing areas of the search space for exploration. Our method fosters user engagement by offering real-time commentary on the optimization progress, explaining the reasoning behind the search strategies. We validate the effectiveness of our approach on synthetic benchmarks with up to 15 independent variables and demonstrate the ability of LLMs to reason in four real-world experimental tasks where context-aware suggestions boost optimization performance substantially.
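A minimal sketch of the hybrid loop the abstract describes, assuming a simple Gaussian-process surrogate with a UCB acquisition and a stand-in `mock_llm_region` for the LLM's context-aware suggestion. All names, parameters, and the mixing probability `p_llm` are illustrative assumptions, not the paper's actual API:

```python
import numpy as np

def rbf(A, B, ls=0.3):
    # Squared-exponential kernel between two sets of points.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * ls ** 2))

def gp_posterior(X, y, Xs, noise=1e-4):
    # Zero-mean GP posterior mean/variance at candidate points Xs.
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.clip(np.diag(rbf(Xs, Xs)) - (v ** 2).sum(0), 1e-12, None)
    return mu, var

def mock_llm_region(bounds):
    # Hypothetical stand-in for an LLM call that maps domain context
    # to a promising sub-region of the search space.
    lo, hi = bounds
    center, width = (lo + hi) / 2, (hi - lo) / 4
    return center - width, center + width

def hybrid_bo(f, bounds, n_iter=20, p_llm=0.3, seed=0):
    # Hybrid loop: each iteration samples candidates either from the
    # full space (stochastic inference) or from the LLM-suggested
    # region, then picks the UCB-maximizing candidate to evaluate.
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    dim = len(lo)
    X = rng.uniform(lo, hi, size=(5, dim))
    y = np.array([f(x) for x in X])
    for _ in range(n_iter):
        if rng.random() < p_llm:
            rlo, rhi = mock_llm_region(bounds)
            cand = rng.uniform(rlo, rhi, size=(200, dim))
        else:
            cand = rng.uniform(lo, hi, size=(200, dim))
        mu, var = gp_posterior(X, y, cand)
        ucb = mu + 2.0 * np.sqrt(var)  # acquisition (maximization)
        x_next = cand[np.argmax(ucb)]
        X = np.vstack([X, x_next])
        y = np.append(y, f(x_next))
    best = np.argmax(y)
    return X[best], y[best]
```

For example, maximizing `f(x) = -||x - 0.5||^2` on the unit square converges near the optimum within a handful of evaluations; the LLM-suggested region simply narrows candidate sampling, which is one plausible way to realize the "localized fruitful regions" idea above.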