🤖 AI Summary
This paper investigates the joint optimization of server count, scheduling policy, and system architecture under a fixed computational budget to minimize mean job response time. Using high-resolution traces from Google Cloud production workloads, we develop a multi-stage server cluster model and systematically compare classical policies—including Join-Idle-Queue (JIQ) and Round-Robin (RR)—against state-of-the-art size-aware schedulers. Our findings reveal: (1) a critical server count at which mean response time is minimized; (2) in high-parallelism or multi-tier architectures, RR and JIQ significantly outperform conventional size-aware policies; and (3) the degree of parallelism and the architectural design exert greater influence on performance than the sophistication of the scheduling algorithm. Collectively, these results suggest an optimization paradigm in which “architecture and parallelism” dominate over “algorithmic refinement.”
📝 Abstract
While the scheduling and dispatching of computational workloads is a well-investigated subject, only recently has Google publicly released a vast, high-resolution measurement dataset of its cloud workloads. We revisit dispatching and scheduling algorithms fed with traffic workloads derived from those measurements. The main finding is that mean job response time attains a minimum as the number of servers in the computing cluster is varied, under the constraint that the overall computational budget is kept constant. Moreover, simple policies such as Join-Idle-Queue appear to attain the same performance as more complex, size-based policies for suitably high degrees of parallelism. Further, multi-stage server clusters achieve still better performance, decisively outperforming size-based dispatching policies, even under very simple policies such as Round-Robin. The takeaway is that the parallelism and architecture of computing systems may be more powerful knobs for controlling performance than scheduling policies are, under realistic workload traffic.
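The core trade-off behind the first finding can be reproduced in miniature: split a fixed total service capacity across n FCFS servers and sweep n while keeping the arrival process and job-size distribution fixed. The sketch below is a toy single-stage simulator, not the paper's model — the arrival rate, the Pareto size distribution, and the random fallback for JIQ when no server is idle are all illustrative assumptions.

```python
import random

def simulate(n_servers, policy, n_jobs=50_000, total_capacity=1.0,
             arrival_rate=0.7, seed=1):
    """Mean job response time for n_servers FCFS servers whose speeds
    sum to total_capacity (the fixed 'computational budget')."""
    rng = random.Random(seed)
    speed = total_capacity / n_servers        # budget split evenly
    free = [0.0] * n_servers                  # time each server next goes idle
    t, rr, total_resp = 0.0, 0, 0.0
    for _ in range(n_jobs):
        t += rng.expovariate(arrival_rate)    # Poisson arrivals
        size = rng.paretovariate(2.1) - 1.0   # heavy-tailed job sizes (assumed)
        if policy == "JIQ":
            # Join-Idle-Queue: dispatch to an idle server if one exists,
            # otherwise (assumption) pick a busy server uniformly at random
            idle = [s for s in range(n_servers) if free[s] <= t]
            s = rng.choice(idle) if idle else rng.randrange(n_servers)
        else:                                 # Round-Robin
            s, rr = rr, (rr + 1) % n_servers
        start = max(t, free[s])               # wait if the server is busy
        free[s] = start + size / speed
        total_resp += free[s] - t             # response = completion - arrival
    return total_resp / n_jobs
```

Sweeping `n_servers` with either policy typically traces a U-shaped curve: few fast servers suffer queueing behind large jobs, while many slow servers stretch every job's service time, so the mean response time bottoms out at an intermediate server count.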