LLM4Perf: Large Language Models Are Effective Samplers for Multi-Objective Performance Modeling

📅 2025-12-17
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Software systems face challenges in multi-objective performance modeling due to vast configuration spaces and low sampling efficiency. Method: This paper proposes LLM4Perf, the first large language model (LLM)-based, feedback-driven collaborative sampling framework. It integrates semantic information from configuration documentation with runtime performance feedback to enable dynamic configuration-space pruning and online refinement of sampling strategies. Contribution/Results: Unlike conventional approaches, LLM4Perf empirically demonstrates, for the first time, the LLM's generalizable pruning capability in performance modeling, significantly enhancing multiple baseline methods. Across 112 evaluation scenarios, it achieves the best performance in 68.8%; across 448 baseline experiments, 91.5% show improvements attributable to its pruning mechanism. This work establishes a reproducible framework and a robust empirical foundation for LLM-enabled performance engineering.

📝 Abstract
The performance of modern software systems is critically dependent on their complex configuration options. Building accurate performance models to navigate this vast space requires effective sampling strategies, yet existing methods often struggle with multi-objective optimization and cannot leverage semantic information from documentation. The recent success of Large Language Models (LLMs) motivates the central question of this work: Can LLMs serve as effective samplers for multi-objective performance modeling? To explore this, we present a comprehensive empirical study investigating the capabilities and characteristics of LLM-driven sampling. We design and implement LLM4Perf, a feedback-based framework, and use it to systematically evaluate the LLM-guided sampling process across four highly configurable, real-world systems. Our study reveals that the LLM-guided approach outperforms traditional baselines in most cases. Quantitatively, LLM4Perf achieves the best performance in nearly 68.8% (77 out of 112) of all evaluation scenarios, demonstrating its superior effectiveness. We find this effectiveness stems from the LLM's dual capabilities of configuration space pruning and feedback-driven strategy refinement. The effectiveness of this pruning is further validated by the fact that it also improves the performance of the baseline methods in nearly 91.5% (410 out of 448) of cases. Furthermore, we show how the LLM choices for each component and hyperparameters within LLM4Perf affect its effectiveness. Overall, this paper provides strong evidence for the effectiveness of LLMs in performance engineering and offers concrete insights into the mechanisms that drive their success.
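The abstract describes a feedback loop in which an LLM proposes configurations, observes measured performance, and prunes the configuration space. The following is a minimal sketch of that loop's shape, not the paper's actual implementation: all names (`SPACE`, `propose`, `prune`, `measure`) are hypothetical, and the LLM's two roles (proposing samples and pruning the space) are replaced by simple stand-in heuristics.

```python
import random

random.seed(0)  # deterministic for the sketch

# Toy configuration space: option name -> candidate values.
SPACE = {
    "threads": [1, 2, 4, 8],
    "cache_mb": [64, 256, 1024],
    "compression": ["none", "lz4", "zstd"],
}

def measure(cfg):
    """Stand-in for running the system under test.

    Returns (latency, memory), both to be minimized."""
    latency = 100.0 / cfg["threads"] + (5.0 if cfg["compression"] == "zstd" else 0.0)
    memory = cfg["cache_mb"] + 10 * cfg["threads"]
    return latency, memory

def propose(space, k=4):
    """Stand-in for the LLM proposer: here, plain random sampling.

    A real implementation would prompt the LLM with option
    documentation plus the measured feedback so far."""
    return [{opt: random.choice(vals) for opt, vals in space.items()}
            for _ in range(k)]

def prune(space, feedback):
    """Stand-in for LLM pruning: keep only option values that appear
    in some Pareto-optimal configuration measured so far."""
    pareto = [c for c, s in feedback
              if not any(o[0] <= s[0] and o[1] <= s[1] and o != s
                         for _, o in feedback)]
    return {opt: sorted({c[opt] for c in pareto}, key=SPACE[opt].index)
            for opt in space}

# Feedback loop: sample, measure, prune, repeat.
feedback = []
space = dict(SPACE)
for _ in range(3):
    for cfg in propose(space):
        feedback.append((cfg, measure(cfg)))
    space = prune(space, feedback)
```

The key design point the paper's results hinge on is the `prune` step: shrinking the space between rounds concentrates later samples on promising regions, which is why the paper reports that the pruning also helps the baseline samplers it is combined with.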
Problem

Research questions and friction points this paper is trying to address.

Can LLMs serve as effective samplers for multi-objective performance modeling?
Existing sampling methods struggle with multi-objective optimization and cannot leverage semantic information from documentation
Which mechanisms (configuration-space pruning, feedback-driven strategy refinement) drive LLM sampling effectiveness?
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-driven sampling for multi-objective performance modeling
Feedback-based framework LLM4Perf for configuration space pruning
LLM-guided approach outperforms traditional baselines in evaluation
Xin Wang
The Hong Kong University of Science and Technology (Guangzhou)
Zhenhao Li
York University
Zishuo Ding
The Hong Kong University of Science and Technology (Guangzhou)
Software Engineering