🤖 AI Summary
Existing benchmarks (e.g., TPC-H, TPC-DS) fail to capture the distribution shifts and dynamic evolution characteristic of real production workloads, which hinders the development and evaluation of learned database components. This paper introduces Redbench, a collection of 30 workloads that reproduce the query distribution characteristics and temporal evolution patterns observed in Redset, a public trace of Amazon Redshift production workloads. Its workload alignment mechanism preserves fidelity along key dimensions such as distribution shift, hotspot drift, and long-tail structure. Redbench supports reproducible query sampling and workload feature analysis, enabling more realistic training and evaluation of learned components such as indexes and query optimizers.
📝 Abstract
Instance-optimized components have made their way into production systems. To some extent, this adoption is due to the characteristics of customer workloads, which can be individually leveraged during the model training phase. However, there is a gap between research and industry that impedes the development of realistic learned components: the lack of suitable workloads. Existing ones, such as TPC-H and TPC-DS, and even more recent ones, such as DSB and CAB, fail to exhibit real workload patterns, particularly distribution shifts. In this paper, we introduce Redbench, a collection of 30 workloads that reflect query patterns observed in the real world. The workloads were obtained by sampling queries from support benchmarks and aligning them with workload characteristics observed in Redset.
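As a rough illustration of the alignment idea (a toy sketch, not Redbench's actual implementation), weighted sampling can skew a pool of benchmark query templates toward a target frequency distribution observed in a real trace, so that a few hot queries dominate while a long tail of cold queries remains:

```python
import random
from collections import Counter

def align_workload(query_pool, target_freqs, n_queries, seed=0):
    """Sample a workload from `query_pool` so that query repetition follows
    `target_freqs` (relative weights, e.g., measured from a real trace).
    Hypothetical helper: Redbench's real alignment also matches temporal
    patterns such as hotspot drift, which this sketch does not model."""
    rng = random.Random(seed)
    # Pair each benchmark query with a target relative frequency.
    weights = [target_freqs[i % len(target_freqs)] for i in range(len(query_pool))]
    return rng.choices(query_pool, weights=weights, k=n_queries)

# Hypothetical pool of query templates (stand-ins for TPC-H/TPC-DS queries).
pool = [f"Q{i}" for i in range(10)]
# Zipf-like target: a few hot queries dominate, mirroring long-tail structure.
target = [1.0 / (rank + 1) for rank in range(10)]
workload = align_workload(pool, target, n_queries=1000)
counts = Counter(workload)
```

With this skew, the hottest template (`Q0`, weight 1.0) appears roughly ten times as often as the coldest (`Q9`, weight 0.1), giving a static approximation of the repetition structure that Redbench additionally evolves over time.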