Multi-Agent LLMs for Adaptive Acquisition in Bayesian Optimization

📅 2026-03-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) struggle to explicitly model the exploration–exploitation trade-off in Bayesian optimization, leading to unstable and hard-to-control search behavior. This work proposes a multi-agent framework that decouples the trade-off into two distinct roles: policy coordination and candidate generation. The coordinator dynamically assigns interpretable weights to acquisition functions, while the generator produces candidate solutions conditioned on these weights. By introducing a collaborative multi-agent mechanism, the approach makes the exploration–exploitation process explicit and governable, overcoming the cognitive limitations of single-agent prompting and improving both interpretability and stability. Experimental results show that the proposed framework significantly outperforms existing LLM-based methods on multiple continuous black-box optimization benchmarks, avoiding premature convergence and achieving more efficient and robust search performance.
📝 Abstract
The exploration-exploitation trade-off is central to sequential decision-making and black-box optimization, yet how Large Language Models (LLMs) reason about and manage this trade-off remains poorly understood. Unlike Bayesian Optimization, where exploration and exploitation are explicitly encoded through acquisition functions, LLM-based optimization relies on implicit, prompt-based reasoning over historical evaluations, making search behavior difficult to analyze or control. In this work, we present a metric-level study of LLM-mediated search policy learning, examining how LLMs construct and adapt exploration-exploitation strategies under multiple operational definitions of exploration, including informativeness, diversity, and representativeness. We show that single-agent LLM approaches, which jointly perform strategy selection and candidate generation within a single prompt, suffer from cognitive overload, leading to unstable search dynamics and premature convergence. To address this limitation, we propose a multi-agent framework that decomposes exploration-exploitation control into strategic policy mediation and tactical candidate generation. A strategy agent assigns interpretable weights to multiple search criteria, while a generation agent produces candidates conditioned on the resulting weight-defined search policy. This decomposition renders exploration-exploitation decisions explicit, observable, and adjustable. Empirical results across various continuous optimization benchmarks indicate that separating strategic control from candidate generation substantially improves the effectiveness of LLM-mediated search.
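The coordinator/generator split described in the abstract can be illustrated with a minimal sketch. Note the assumptions: in the paper both roles are LLM agents, whereas here they are replaced with simple numeric stand-ins (an annealing schedule for the strategy agent, an argmax over a weighted score for the generation agent); all function names (`coordinator_weights`, `generate_candidate`, etc.) are hypothetical and not from the paper.

```python
import math
import numpy as np

def _normal_cdf(z):
    # standard normal CDF via the error function (no scipy needed)
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

def expected_improvement(mu, sigma, best):
    # exploitation-leaning criterion: EI under a Gaussian posterior
    # (minimization convention: improvement means dropping below `best`)
    sigma = np.maximum(sigma, 1e-12)
    z = (best - mu) / sigma
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    return (best - mu) * _normal_cdf(z) + sigma * pdf

def min_distance(candidates, history):
    # diversity criterion: distance to the nearest already-evaluated point
    d = np.abs(candidates[:, None] - history[None, :])
    return d.min(axis=1)

def coordinator_weights(round_idx, n_rounds):
    # stand-in for the strategy agent: anneal from pure exploration
    # toward pure exploitation as the budget is spent
    w_explore = 1.0 - round_idx / max(n_rounds - 1, 1)
    return {"ei": 1.0 - w_explore, "diversity": w_explore}

def generate_candidate(candidates, mu, sigma, history, best, weights):
    # stand-in for the generation agent: argmax of the weighted policy
    def norm(v):  # rescale each criterion to [0, 1] so weights are comparable
        rng = v.max() - v.min()
        return (v - v.min()) / rng if rng > 0 else np.zeros_like(v)
    score = (weights["ei"] * norm(expected_improvement(mu, sigma, best))
             + weights["diversity"] * norm(min_distance(candidates, history)))
    return candidates[int(np.argmax(score))]
```

The point of the decomposition is visible in the interface: the weights are the entire channel between the two roles, so the exploration-exploitation decision is explicit and inspectable rather than buried in a single prompt.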
Problem

Research questions and friction points this paper is trying to address.

exploration-exploitation trade-off
Large Language Models
Bayesian Optimization
search policy
multi-agent
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Agent LLMs
Exploration-Exploitation Trade-off
Bayesian Optimization
Adaptive Acquisition
Search Policy Decomposition
Andrea Carbonati
University of Illinois Chicago, IL, USA
Mohammadsina Almasi
University of Illinois Chicago, IL, USA
Hadis Anahideh
Assistant Professor @ University of Illinois at Chicago
Black-box Optimization · Active Learning · Applied Machine Learning · Algorithmic Fairness · XAI