🤖 AI Summary
This work addresses the design of dynamic server allocation strategies in cloud computing. We propose a semi-Markov decision process (SMDP) analytical model to quantify, for the first time, the performance gap between several lightweight state-dependent routing policies and the theoretical optimum. Focusing on multi-site systems under a load-aware scheduling framework, we characterize how exploiting state information improves the latency–cost trade-off in dynamic allocation. Our analysis shows that, under typical workloads, simple policies incur less than 15% degradation in response latency relative to the optimal policy, demonstrating strong engineering practicality. The study establishes a tractable, analytically solvable benchmark model for dynamic resource allocation and provides theoretical justification for deploying lightweight policies in large-scale cloud systems. It thus points toward resource scheduling that jointly optimizes service quality and operational cost.
📝 Abstract
Cloud computing enables the dynamic provisioning of server resources. To exploit this opportunity, a policy is needed for dynamically allocating (and deallocating) servers in response to current load conditions. In this paper we describe several simple policies for dynamic server allocation and develop analytic models for their analysis. We also design semi-Markov decision models that enable determination of the performance achieved with optimal policies, allowing us to quantify the performance gap between simple, easily implemented policies and optimal policies. Finally, we apply our models to study the potential performance benefits of state-dependent routing in multi-site systems when using dynamic server allocation at each site. Insights from our results are valuable to service providers seeking to balance cloud service costs and delays.
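To make the idea of a simple load-dependent allocation policy concrete, here is a toy sketch, not the paper's model: a hypothetical rule that scales the number of active servers with the current queue length, compared in a slotted-time simulation against statically keeping all servers on. All parameter names and values (`jobs_per_server`, arrival and service rates) are illustrative assumptions, not taken from the paper.

```python
import math
import random

def desired_servers(queue_len, s_min=1, s_max=8, jobs_per_server=4):
    # Hypothetical load-aware rule: one server per `jobs_per_server`
    # queued jobs, clamped to [s_min, s_max].
    return max(s_min, min(s_max, math.ceil(queue_len / jobs_per_server)))

def _poisson(rng, lam):
    # Knuth's algorithm for sampling a Poisson-distributed arrival count.
    limit, k, p = math.exp(-lam), 0, 1.0
    while p > limit:
        k += 1
        p *= rng.random()
    return k - 1

def simulate(policy, arrival_rate=2.0, service_prob=0.5, slots=10000, seed=0):
    """Slotted simulation: Poisson arrivals each slot; each active server
    completes one job per slot with probability `service_prob`.
    Returns (average queue length, average active servers)."""
    rng = random.Random(seed)
    queue = queue_area = server_area = 0
    for _ in range(slots):
        queue += _poisson(rng, arrival_rate)
        servers = policy(queue)
        completions = sum(rng.random() < service_prob for _ in range(servers))
        queue = max(0, queue - completions)
        queue_area += queue     # proxy for delay via Little's law
        server_area += servers  # proxy for operating cost
    return queue_area / slots, server_area / slots

# Dynamic allocation trades a longer queue for fewer server-slots paid for.
avg_q_dyn, avg_s_dyn = simulate(desired_servers)
avg_q_stat, avg_s_stat = simulate(lambda q: 8)
```

The interesting comparison is the cost–delay trade-off: the static policy pins `avg_s_stat` at 8 servers, while the dynamic rule uses fewer servers on average at the price of a longer average queue, the gap the paper's SMDP benchmark is designed to quantify against the true optimum.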