Evolution of Benchmark: Black-Box Optimization Benchmark Design through Large Language Model

📅 2026-01-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional black-box optimization benchmarks are often manually designed, rendering them susceptible to expert bias and limited in diversity, thereby hindering objective algorithm evaluation. This work proposes an automated benchmark generation framework that, for the first time, integrates large language models with program evolution through a bi-objective optimization approach. Leveraging prompt engineering and a reflective co-evolutionary mechanism, the framework iteratively synthesizes test functions that simultaneously exhibit diverse fitness landscapes and high discriminative power among optimization algorithms. The generated benchmarks demonstrate strong effectiveness and generalization across multiple scenarios, including algorithm performance assessment, training of learning-augmented optimizers, and surrogate modeling for computationally expensive real-world problems, thereby enhancing the objectivity, diversity, and practical utility of optimization benchmarks.

📝 Abstract
Benchmark design in Black-Box Optimization (BBO) is a fundamental yet open-ended topic. Early BBO benchmarks are predominantly human-crafted, which introduces expert bias and constrains diversity. Automating the design process can relieve the human-in-the-loop burden while enhancing diversity and objectivity. We propose Evolution of Benchmark (EoB), an automated BBO benchmark designer empowered by a large language model (LLM) and its program-evolution capability. Specifically, we formulate benchmark design as a bi-objective optimization problem that maximizes (i) landscape diversity and (ii) algorithm-differentiation ability across a portfolio of BBO solvers. Under this paradigm, EoB iteratively prompts the LLM to evolve a population of benchmark programs and employs a reflection-based scheme to co-evolve each landscape and its corresponding program. Comprehensive experiments validate that EoB is a competitive candidate in multiple use cases: 1) benchmarking BBO algorithms; 2) training and testing learning-assisted BBO algorithms; 3) serving as a proxy for expensive real-world problems.
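The abstract describes the EoB loop only at a high level; the sketch below shows one plausible way such a bi-objective program-evolution loop could be wired up. Everything here is an assumption for illustration: `propose_variant` is a hypothetical hook where the paper would call an LLM, the landscape "features" are plain sampled objective values, and the domination-count selection is a crude stand-in for a proper bi-objective selector such as NSGA-II.

```python
# Minimal sketch of an EoB-style bi-objective evolution loop (illustrative only;
# the paper's actual prompts, landscape features, and selection are not reproduced).
import numpy as np

def landscape_diversity(candidate, others, probes):
    # "Features" here are just objective values at fixed probe points --
    # a deliberate simplification of real landscape-analysis features.
    feats = lambda f: np.array([f(x) for x in probes])
    if not others:
        return 0.0
    c = feats(candidate)
    return float(np.mean([np.linalg.norm(c - feats(g)) for g in others]))

def algorithm_differentiation(candidate, solvers, dim=5, budget=200):
    # Spread of best values found by a solver portfolio; a larger spread
    # means the function separates the solvers more clearly.
    bests = [solve(candidate, dim, budget) for solve in solvers]
    return float(np.std(bests))

def dominated_by(a, b):
    # True if objective vector `a` is Pareto-dominated by `b` (maximization).
    return all(y >= x for x, y in zip(a, b)) and any(y > x for x, y in zip(a, b))

def evolve_benchmarks(seeds, propose_variant, solvers, generations=10, pop_size=8):
    rng = np.random.default_rng(0)
    probes = [rng.uniform(-5.0, 5.0, size=5) for _ in range(32)]
    pop = list(seeds)
    for _ in range(generations):
        parent = pop[rng.integers(len(pop))]
        child = propose_variant(parent)  # stands in for the LLM rewrite step
        candidates = pop + [child]
        scores = [(landscape_diversity(f, [g for g in candidates if g is not f], probes),
                   algorithm_differentiation(f, solvers)) for f in candidates]
        # Rank by how many rivals Pareto-dominate each candidate (fewer is better).
        order = sorted(range(len(candidates)),
                       key=lambda i: sum(dominated_by(scores[i], s) for s in scores))
        pop = [candidates[i] for i in order[:pop_size]]
    return pop

# Toy usage: a two-solver portfolio and a trivial stand-in for the LLM mutation.
def random_search(fn, dim, budget):
    rng = np.random.default_rng(1)
    return min(float(fn(rng.uniform(-5.0, 5.0, size=dim))) for _ in range(budget))

def one_plus_one_es(fn, dim, budget):
    rng = np.random.default_rng(2)
    x = rng.uniform(-5.0, 5.0, size=dim)
    fx = float(fn(x))
    for _ in range(budget - 1):
        y = x + 0.5 * rng.standard_normal(dim)
        fy = float(fn(y))
        if fy < fx:
            x, fx = y, fy
    return fx

sphere = lambda x: float(np.sum(x ** 2))
rastrigin_ish = lambda x: float(np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x) + 10))
mutate = lambda f: (lambda x, f=f: f(x) + 0.5 * float(np.sin(np.sum(x))))

benchmarks = evolve_benchmarks([sphere, rastrigin_ish], mutate,
                               [random_search, one_plus_one_es])
```

The two scores mirror the abstract's objectives: landscape_diversity rewards functions whose response profile differs from the rest of the population, while algorithm_differentiation rewards functions on which the portfolio's results spread apart.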
Problem

Research questions and friction points this paper is trying to address.

Black-Box Optimization
Benchmark Design
Landscape Diversity
Algorithm Differentiation
Automated Design
Innovation

Methods, ideas, or system contributions that make the work stand out.

Black-Box Optimization
Large Language Model
Automated Benchmark Design
Program Evolution
Landscape Diversity
Chen Wang
South China University of Technology

Sijie Ma
South China University of Technology

Zeyuan Ma
South China University of Technology
Meta-Black-Box Optimization, Reinforcement Learning, Learning to Optimize

Yue-Jiao Gong
South China University of Technology