Sampling in Cloud Benchmarking: A Critical Review and Methodological Guidelines

📅 2024-12-09
🏛️ 2024 IEEE International Conference on Cloud Computing Technology and Science (CloudCom)
📈 Citations: 0
Influential citations: 0
🤖 AI Summary
Cloud benchmarking suffers from significant performance variability due to resource contention, hardware heterogeneity, and network latency; critically, the sampling strategy—long overlooked in experimental design—further undermines result comparability, reproducibility, and reliability. This paper identifies three pervasive methodological flaws through a critical literature review and empirical trend analysis: (1) widespread use of non-probabilistic sampling, (2) overreliance on single benchmarks, and (3) inaccessible or undocumented samples. To address these issues, we propose the first standardized sampling methodology for cloud benchmarking, comprising three core components: principled sampling design guidelines, transparency requirements for reporting sampling procedures, and a verifiable evaluation framework. Our methodology fills a critical standardization gap in cloud benchmarking practice, offering actionable, implementation-ready guidance to enhance scientific rigor, cross-study comparability, and experimental reproducibility.

📝 Abstract
Cloud benchmarks suffer from performance fluctuations caused by resource contention, network latency, hardware heterogeneity, and other factors, as well as by decisions taken in the benchmark design. In particular, the sampling strategy chosen by benchmark designers can significantly influence benchmark results. Despite this well-known fact, no systematic approach has been devised so far to make sampling results comparable and to guide benchmark designers in choosing a sampling strategy for their benchmarks. To identify systematic problems, we critically review sampling in recent cloud computing research. Our analysis identifies concerning trends: (i) a high prevalence of non-probability sampling, (ii) over-reliance on a single benchmark, and (iii) restricted access to samples. To address these issues and increase transparency in sampling, we propose methodological guidelines for researchers and reviewers. We hope that our work contributes to improving the generalizability, reproducibility, and reliability of research results.
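
For readers unfamiliar with the distinction the abstract draws, the sketch below contrasts a convenience (non-probability) sample with a simple random (probability) sample drawn from a sampling frame of cloud instance types. This is an illustrative sketch, not code from the paper; the instance names, population size, and seed are assumptions chosen for the example.

```python
# Minimal sketch: convenience vs. simple random sampling of cloud configurations.
# The sampling frame, instance names, and sample size are illustrative assumptions.
import random

# Hypothetical sampling frame: every instance type offered in one provider region.
population = [f"type-{i:03d}" for i in range(120)]

# Non-probability (convenience) sample: whichever instances are cheapest or most
# familiar; here simply the first five entries of the catalogue.
convenience_sample = population[:5]

# Probability sample: every instance type has a known, equal chance of selection.
# A fixed seed makes the draw reproducible and easy to document.
rng = random.Random(42)
random_sample = rng.sample(population, k=5)

print("convenience sample:", convenience_sample)
print("random sample:     ", random_sample)
```

Only the second draw supports statistical generalization to the whole catalogue; the first characterizes only the instances that happened to be picked.
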
Problem

Research questions and friction points this paper is trying to address.

Performance fluctuations in cloud benchmarking
Lack of a systematic sampling strategy
Prevalence of non-probability sampling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Critical review of sampling in recent cloud research
Proposed methodological sampling guidelines (a hypothetical sketch follows below)
Increased transparency and reliability of research results
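
To make the transparency idea concrete, here is a minimal, hypothetical sketch of the metadata a benchmark report could record about its sample so that reviewers can check and reproduce the draw. The field names, example values, and URL are assumptions for illustration and do not come from the paper's guidelines.

```python
# Hypothetical sampling-transparency record; field names and values are illustrative assumptions.
from dataclasses import dataclass, field, asdict
from typing import Optional
import json

@dataclass
class SamplingReport:
    sampling_frame: str                 # population that was sampled
    method: str                         # e.g., "simple random", "stratified", "convenience"
    sample_size: int
    random_seed: Optional[int]          # None for non-probability sampling
    exclusions: list = field(default_factory=list)  # items removed from the frame, with reasons
    sample_archive_url: str = ""        # where the drawn sample is published for others to access

report = SamplingReport(
    sampling_frame="all general-purpose VM types listed in one provider region (hypothetical)",
    method="simple random",
    sample_size=5,
    random_seed=42,
    sample_archive_url="https://example.org/artifacts/sample.json",  # placeholder URL
)
print(json.dumps(asdict(report), indent=2))
```

Publishing such a record alongside results would speak to both the documentation and the sample-access concerns raised in the review.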