LABO: LLM-Accelerated Bayesian Optimization through Broad Exploration and Selective Experimentation

📅 2026-05-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses high experimental costs and scarce data in scientific discovery. Existing approaches that bring large language models (LLMs) into Bayesian optimization (BO) typically embed them in the sampling or surrogate-modeling pipeline and do not exploit the fact that an LLM evaluation is far cheaper than a real experiment. The authors propose LABO, a framework that combines LLM predictions with real experimental observations within a single BO loop. A gating criterion dynamically balances the two: inexpensive LLM evaluations explore the search space broadly, while costly physical experiments are reserved for regions of high uncertainty. The authors provide a cumulative regret bound that formalizes the efficiency gain and report that LABO consistently outperforms existing methods across diverse scientific tasks under identical experimental budgets.

📝 Abstract

The high cost and data scarcity in scientific exploration have motivated the use of large language models (LLMs) as knowledge-driven components in Bayesian optimization (BO). However, existing approaches typically embed LLMs directly into the sampling or surrogate modeling pipeline, without fully leveraging their significantly lower evaluation cost compared to real-world experiments. To address this limitation, we propose LLM-Accelerated Bayesian Optimization (LABO), a framework that combines LLM predictions with experimental observations within a single BO loop. LABO employs a gating criterion to dynamically balance the reliance on LLM predictions versus actual experiments. By leveraging inexpensive LLM evaluations to broadly explore the search space and reserving costly real experiments only for regions with high uncertainty, LABO achieves more sample-efficient optimization. We provide a theoretical analysis with a cumulative regret bound that formalizes this efficiency gain. Empirical results across diverse scientific tasks demonstrate that LABO consistently outperforms existing methods under identical experimental budgets. Our results suggest that LABO offers a practical and theoretically grounded approach for integrating LLMs into scientific discovery workflows.
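To make the gating idea concrete, the following is a minimal sketch of an uncertainty-gated BO loop. It is not the authors' algorithm: the abstract does not specify the gating criterion, the surrogate, or the acquisition function. The sketch assumes a zero-mean Gaussian-process surrogate with an RBF kernel, an upper-confidence-bound acquisition, and a gate that compares the posterior standard deviation at the chosen point with a threshold `tau`. The functions `true_experiment` and `llm_proxy`, and the parameters `tau`, `beta`, and `budget`, are hypothetical stand-ins chosen for illustration.

```python
import numpy as np

def rbf(a, b, length=0.3):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_obs, y_obs, noise_var, x_query):
    """Zero-mean GP posterior mean and std with per-observation noise."""
    K = rbf(x_obs, x_obs) + np.diag(noise_var)
    Ks = rbf(x_obs, x_query)
    sol = np.linalg.solve(K, Ks)              # K^{-1} Ks
    mean = sol.T @ y_obs
    var = 1.0 - np.sum(Ks * sol, axis=0)      # prior variance is 1
    return mean, np.sqrt(np.maximum(var, 0.0))

def true_experiment(x):
    """Hypothetical expensive objective (stands in for a real experiment)."""
    return np.sin(3.0 * x) + 0.5 * x

def llm_proxy(x):
    """Hypothetical cheap, biased prediction (stands in for an LLM call)."""
    return true_experiment(x) + 0.3 * np.cos(7.0 * x)

def gated_bo(budget=5, max_steps=40, tau=0.4, beta=2.0):
    grid = np.linspace(0.0, 2.0, 101)
    # Broad, cheap exploration: seed the surrogate with proxy predictions,
    # recorded with a large noise variance because they are less trustworthy.
    x_obs = np.linspace(0.0, 2.0, 5)
    y_obs = llm_proxy(x_obs)
    noise = np.full(5, 0.1)
    is_real = np.zeros(5, dtype=bool)
    for _ in range(max_steps):
        if is_real.sum() >= budget:           # experimental budget spent
            break
        mean, std = gp_posterior(x_obs, y_obs, noise, grid)
        i = int(np.argmax(mean + beta * std)) # UCB acquisition (assumption)
        x = grid[i]
        # Assumed gate: run the real experiment only where the surrogate is
        # still uncertain; otherwise fall back to the cheap proxy.
        use_real = std[i] > tau
        y = true_experiment(x) if use_real else llm_proxy(x)
        x_obs = np.append(x_obs, x)
        y_obs = np.append(y_obs, y)
        noise = np.append(noise, 1e-4 if use_real else 0.1)
        is_real = np.append(is_real, use_real)
    return x_obs, y_obs, is_real

x_obs, y_obs, is_real = gated_bo()
print("real experiments used:", int(is_real.sum()))
print("proxy evaluations used:", int((~is_real).sum()))
```

In this sketch, proxy predictions enter the surrogate as noisy pseudo-observations, so they shape where the acquisition looks without being trusted as much as real measurements. How LABO itself weights or corrects LLM predictions is not stated in the abstract.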

Problem

Research questions and friction points this paper is trying to address.

Bayesian Optimization

Large Language Models

Sample Efficiency

Scientific Discovery

Experimental Cost

Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian Optimization

Large Language Models

Sample Efficiency