🤖 AI Summary
Database configuration tuning faces three major challenges: a vast parameter search space, slow convergence, and poor transferability across hardware platforms and workloads. This paper proposes L2T-Tune, an LLM-guided, three-stage hybrid tuning framework. First, Latin Hypercube Sampling (LHS) generates uniform samples across the knob space and logs them into a shared sample pool, establishing an offline warm-start mechanism. Second, a large language model extracts domain-specific knowledge from official manuals and community documentation to recommend and prioritize tuning hints. Third, the framework uses the warm-start pool to reduce the dimensionality of knobs and state features, then performs online fine-tuning with the TD3 reinforcement learning algorithm. The framework significantly improves training efficiency and generalization: it achieves an average 37.1% performance gain across diverse workloads, up to 73% on TPC-C, and converges within only 30 online tuning steps, substantially outperforming existing RL-based methods. Its core contribution is the first integration of LLMs into the closed-loop database tuning pipeline, enabling knowledge-driven, efficient, and adaptive optimization.
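The warm-start stage draws configurations uniformly over the knob space with Latin Hypercube Sampling: each dimension is split into one stratum per sample, and the strata are shuffled independently per dimension. A minimal sketch of plain LHS (the knob names and ranges are illustrative, not taken from the paper):

```python
import random

def latin_hypercube(n_samples, knob_ranges, seed=0):
    """Draw n_samples points, one per stratum in every dimension,
    with the strata independently shuffled across dimensions."""
    rng = random.Random(seed)
    n_dims = len(knob_ranges)
    # Per dimension: a random permutation of stratum indices 0..n_samples-1
    strata = [rng.sample(range(n_samples), n_samples) for _ in range(n_dims)]
    samples = []
    for i in range(n_samples):
        point = []
        for d, (lo, hi) in enumerate(knob_ranges):
            # Jitter uniformly inside the assigned stratum, then scale to range
            u = (strata[d][i] + rng.random()) / n_samples
            point.append(lo + u * (hi - lo))
        samples.append(point)
    return samples

# Hypothetical knob ranges: buffer pool size (MB) and max connections
knobs = [(128.0, 8192.0), (10.0, 500.0)]
warm_start_pool = latin_hypercube(16, knobs)
```

Unlike independent uniform sampling, every dimension is guaranteed exactly one sample per stratum, which is what makes a small warm-start pool cover the knob space evenly.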
📝 Abstract
Configuration tuning is critical for database performance. Although recent advancements in database tuning have shown promising improvements in throughput and latency, challenges remain. First, the vast knob space makes direct optimization unstable and slow to converge. Second, reinforcement learning pipelines often lack effective warm-start guidance and require long offline training. Third, transferability is limited: when hardware or workloads change, existing models typically require substantial retraining to recover performance.
To address these limitations, we propose L2T-Tune, a new LLM-guided hybrid database tuning framework with a three-stage pipeline: stage one performs a warm start that generates uniform samples across the knob space and logs them into a shared pool; stage two leverages a large language model to mine and prioritize tuning hints from manuals and community documents for rapid convergence; stage three uses the warm-start sample pool to reduce the dimensionality of knobs and state features, then fine-tunes the configuration with the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm.
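Stage three's dimensionality reduction relies on the warm-start pool: knobs whose values move performance the most are kept, and the rest are pruned before RL fine-tuning. The paper does not spell out the scoring rule here, so the sketch below assumes a simple correlation-based ranking purely for illustration:

```python
import math

def rank_knobs_by_correlation(samples, scores):
    """Rank knob dimensions by |Pearson correlation| between each knob's
    sampled values and the observed performance metric (most important first)."""
    n = len(samples)
    n_dims = len(samples[0])
    mean_s = sum(scores) / n
    importances = []
    for d in range(n_dims):
        xs = [row[d] for row in samples]
        mean_x = sum(xs) / n
        cov = sum((x - mean_x) * (s - mean_s) for x, s in zip(xs, scores))
        var_x = sum((x - mean_x) ** 2 for x in xs)
        var_s = sum((s - mean_s) ** 2 for s in scores)
        denom = math.sqrt(var_x * var_s)
        importances.append(abs(cov / denom) if denom > 0 else 0.0)
    return sorted(range(n_dims), key=lambda d: -importances[d])

# Hypothetical warm-start pool: 4 configs over 2 knobs, with throughput scores
pool = [[1.0, 5.0], [2.0, 9.0], [3.0, 1.0], [4.0, 7.0]]
throughput = [10.0, 20.0, 30.0, 40.0]
ranking = rank_knobs_by_correlation(pool, throughput)
```

Keeping only the top-ranked knobs shrinks both the RL action space and the state features, which is what lets the TD3 fine-tuning stage converge in few online steps.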
We evaluate L2T-Tune against state-of-the-art models. Compared with the best-performing alternative, our approach improves performance by an average of 37.1% across all workloads, and by up to 73% on TPC-C. Compared with models trained purely with reinforcement learning, it converges rapidly in the offline tuning stage on a single server. Moreover, during the online tuning stage, it takes only 30 steps to reach its best results.