$\texttt{LLINBO}$: Trustworthy LLM-in-the-Loop Bayesian Optimization

📅 2025-05-20

🤖 AI Summary

Problem: Using large language models (LLMs) on their own as black-box optimizers, in place of Bayesian optimization (BO), suffers from the absence of explicit surrogate modeling, poorly calibrated uncertainty, and opaque internal mechanisms, which make the exploration-exploitation trade-off hard to control and weaken theoretical tractability. Method: This paper proposes a trustworthy BO framework that pairs LLMs with Gaussian processes (GPs), introducing the first LLM-GP coupling mechanism. It designs three collaborative mechanisms with theoretical guarantees that use the LLM for high-quality early exploration and rely on the GP for robust exploitation in a closed loop. The framework integrates calibrated uncertainty quantification, context-aware prompt engineering, and theory-guided query generation. Results: Evaluated on a real-world 3D printing process optimization task, the method achieves a 42% faster convergence rate and a 31% improvement in optimal solution quality over standalone LLMs or standard BO. Moreover, it provides the first rigorous regret bound guarantee for LLM-augmented BO.

📝 Abstract

Bayesian optimization (BO) is a sequential decision-making tool widely used for optimizing expensive black-box functions. Recently, Large Language Models (LLMs) have shown remarkable adaptability in low-data regimes, making them promising tools for black-box optimization by leveraging contextual knowledge to propose high-quality query points. However, relying solely on LLMs as optimization agents introduces risks due to their lack of explicit surrogate modeling and calibrated uncertainty, as well as their inherently opaque internal mechanisms. This structural opacity makes it difficult to characterize or control the exploration-exploitation trade-off, ultimately undermining theoretical tractability and reliability. To address this, we propose LLINBO: LLM-in-the-Loop BO, a hybrid framework for BO that combines LLMs with statistical surrogate experts such as Gaussian processes (GPs). The core philosophy is to leverage the contextual reasoning strengths of LLMs for early exploration, while relying on principled statistical models to guide efficient exploitation. Specifically, we introduce three mechanisms that enable this collaboration and establish their theoretical guarantees. We end the paper with a real-life proof-of-concept in the context of 3D printing. The code to reproduce the results can be found at https://github.com/UMDataScienceLab/LLM-in-the-Loop-BO.
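The core idea in the abstract (LLM proposals for early exploration, a GP surrogate for later exploitation) can be illustrated with a toy loop. The sketch below is not taken from the LLINBO repository and is not claimed to match any of the paper's three mechanisms: the decaying schedule `p_llm = 1 / t`, the UCB acquisition, the kernel settings, and the `mock_llm_propose` stand-in (which replaces a real LLM call with a noisy guess) are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel between two 1-D point sets."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)


def gp_posterior(x_obs, y_obs, x_grid, noise=1e-4):
    """Zero-mean GP posterior mean and standard deviation on x_grid."""
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = rbf_kernel(x_grid, x_obs)
    mean = Ks @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))


def mock_llm_propose(rng):
    """Hypothetical stand-in for an LLM call: a noisy 'domain knowledge' guess."""
    return float(np.clip(rng.normal(0.6, 0.15), 0.0, 1.0))


def hybrid_bo(objective, n_iter=25, beta=2.0, seed=0):
    """Maximize objective on [0, 1], deferring to the mock LLM less over time."""
    rng = np.random.default_rng(seed)
    x_grid = np.linspace(0.0, 1.0, 201)
    x_obs = np.array([mock_llm_propose(rng)])  # initial design from the "LLM"
    y_obs = np.array([objective(x_obs[0])])
    used_llm = []
    for t in range(1, n_iter + 1):
        p_llm = 1.0 / t  # assumed schedule: trust the LLM early, the GP later
        if rng.random() < p_llm:
            x_next = mock_llm_propose(rng)
            used_llm.append(True)
        else:
            # GP-UCB step: maximize posterior mean plus scaled uncertainty.
            mean, std = gp_posterior(x_obs, y_obs, x_grid)
            x_next = float(x_grid[np.argmax(mean + beta * std)])
            used_llm.append(False)
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, objective(x_next))
    best = int(np.argmax(y_obs))
    return x_obs[best], y_obs[best], used_llm
```

Calling `hybrid_bo(lambda x: -(x - 0.7) ** 2)` returns the best query found, its objective value, and a per-iteration record of whether the mock LLM or the GP chose the point. Because the probability of deferring to the LLM shrinks with `t`, later iterations are governed almost entirely by the surrogate, which is where its uncertainty estimates are most useful.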

Problem

Research questions and friction points this paper is trying to address.

Combining LLMs with Bayesian Optimization for trustworthy black-box optimization

Addressing LLMs' lack of surrogate modeling and uncertainty calibration

Balancing exploration-exploitation trade-off in hybrid LLM-statistical frameworks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines LLMs with Bayesian Optimization

Uses Gaussian Processes for surrogate modeling

Provides theoretical guarantees for how exploration and exploitation are balanced
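
As background for the exploration-exploitation point, the trade-off in GP-based BO is commonly written through the GP-UCB acquisition rule and the cumulative regret. These are the standard textbook definitions, given here only for orientation; this page does not state which acquisition function or regret notion LLINBO itself analyzes. Here $\mu_{t-1}$ and $\sigma_{t-1}$ are the GP posterior mean and standard deviation after $t-1$ observations, $\beta_t$ is an exploration weight, and $x^\star$ is a maximizer of the objective $f$ over the domain $\mathcal{X}$.

$$
x_t = \arg\max_{x \in \mathcal{X}} \; \mu_{t-1}(x) + \beta_t^{1/2}\,\sigma_{t-1}(x),
\qquad
R_T = \sum_{t=1}^{T} \bigl( f(x^\star) - f(x_t) \bigr)
$$

A larger $\beta_t$ favors uncertain regions (exploration), while a smaller one favors points with a high predicted value (exploitation). A regret bound is a statement that $R_T$ grows sublinearly in $T$.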
