Learning to Score: Tuning Cluster Schedulers through Reinforcement Learning

📅 2023-09-25
🏛️ 2023 IEEE International Conference on Cloud Engineering (IC2E)
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the limitations of traditional cluster schedulers that rely on fixed-weight scoring functions, which struggle to adapt to diverse workloads and consequently constrain both resource utilization and job performance. To overcome this, the authors propose a reinforcement learning–based approach for dynamically tuning scoring weights. The method employs a multi-step parameter optimization framework with a reward mechanism based on percentage improvements, uses frame stacking to convey historical scheduling information, and restricts domain-specific features to enhance generalization. Trained jointly across multiple workloads and cluster configurations, the approach demonstrates strong adaptability to unseen environments. Experimental results in serverless clusters show that it improves end-to-end job performance by 33% on average over fixed-weight baselines and outperforms the best baseline by 12%.
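To make the fixed-weight problem concrete, the sketch below shows how a scheduler might rank a feasible node as a weighted sum of scoring functions. The function names, node/job fields, and equal weights are illustrative assumptions, not taken from the paper; the point is that a single fixed weighting ignores workload characteristics, which is what the RL agent is trained to tune.

```python
# Hypothetical sketch of fixed-weight node scoring.
# Scoring-function names and the node/job schema are assumed for illustration.

def score_node(node, job, weights):
    """Rank a feasible node as a weighted sum of per-node scoring functions."""
    scores = {
        # Prefer nodes with more free CPU.
        "least_allocated": 1.0 - node["cpu_used"] / node["cpu_total"],
        # Prefer nodes whose CPU and memory usage are balanced.
        "balanced_resources": 1.0 - abs(
            node["cpu_used"] / node["cpu_total"]
            - node["mem_used"] / node["mem_total"]
        ),
        # Prefer nodes that already cache the job's container image.
        "image_locality": 1.0 if job["image"] in node["cached_images"] else 0.0,
    }
    return sum(weights[name] * s for name, s in scores.items())

# The one-size-fits-all baseline the paper critiques: equal weights everywhere.
equal_weights = {"least_allocated": 1.0, "balanced_resources": 1.0,
                 "image_locality": 1.0}

node = {"cpu_used": 2, "cpu_total": 8, "mem_used": 4, "mem_total": 16,
        "cached_images": {"python:3.11"}}
job = {"image": "python:3.11"}
print(score_node(node, job, equal_weights))  # 0.75 + 1.0 + 1.0 = 2.75
```

The proposed approach replaces `equal_weights` with a weight vector emitted by the learned policy.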

๐Ÿ“ Abstract
Efficiently allocating incoming jobs to nodes in large-scale clusters can lead to substantial improvements in both cluster utilization and job performance. In order to allocate incoming jobs, cluster schedulers usually rely on a set of scoring functions to rank feasible nodes. Results from individual scoring functions are usually weighted equally, which could lead to suboptimal deployments as the one-size-fits-all solution does not take into account the characteristics of each workload. Tuning the weights of scoring functions, however, requires expert knowledge and is computationally expensive. This paper proposes a reinforcement learning approach for learning the weights in scheduler scoring algorithms with the overall objective of improving the end-to-end performance of jobs for a given cluster. Our approach is based on percentage improvement reward, frame-stacking, and limiting domain information. We propose a percentage improvement reward to address the objective of multi-step parameter tuning. The inclusion of frame-stacking allows for carrying information across an optimization experiment. Limiting domain information prevents overfitting and improves performance in unseen clusters and workloads. The policy is trained on different combinations of workloads and cluster setups. We demonstrate the proposed approach improves performance on average by 33% compared to fixed weights and 12% compared to the best-performing baseline in a lab-based serverless scenario.
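The abstract's "percentage improvement reward" can be sketched as a relative change in the performance metric between consecutive tuning steps. This is a minimal sketch assuming the metric is mean end-to-end job completion time; the paper's exact formulation may differ.

```python
# Hypothetical sketch of a percentage-improvement reward for multi-step
# parameter tuning. The metric (mean job completion time) is an assumption.

def percentage_improvement_reward(prev_jct: float, new_jct: float) -> float:
    """Relative reduction in mean job completion time after a tuning step.

    Positive when the new scoring-weight configuration made jobs faster,
    negative when it made them slower.
    """
    return (prev_jct - new_jct) / prev_jct

# At each step the agent adjusts the scoring weights, the scheduler runs
# with them, and the reward reflects the percentage change:
print(percentage_improvement_reward(100.0, 80.0))   # 20% faster -> 0.2
print(percentage_improvement_reward(100.0, 110.0))  # 10% slower -> -0.1
```

A relative reward keeps the signal comparable across workloads whose absolute completion times differ by orders of magnitude, which matters when one policy is trained jointly on many cluster setups.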
Problem

Research questions and friction points this paper is trying to address.

cluster scheduling
scoring functions
weight tuning
job allocation
reinforcement learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

reinforcement learning
cluster scheduling
scoring function tuning
frame-stacking
generalization
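The frame-stacking idea listed above can be sketched as keeping the last k scheduling states in a rolling buffer and feeding their concatenation to the policy, so information carries across steps of an optimization experiment. The value of k and the state contents are assumptions for illustration.

```python
# Hypothetical sketch of frame-stacking for the tuning agent's observation.
# k and the per-step state vector are illustrative assumptions.
from collections import deque

import numpy as np


class FrameStack:
    def __init__(self, k: int, state_dim: int):
        # Start with k zero frames so the observation has a fixed size.
        self.frames = deque([np.zeros(state_dim)] * k, maxlen=k)

    def push(self, state) -> np.ndarray:
        """Append the newest state and return the stacked observation."""
        self.frames.append(np.asarray(state, dtype=float))
        return np.concatenate(self.frames)


stack = FrameStack(k=3, state_dim=2)
obs = stack.push([0.5, 0.1])  # e.g. normalized utilization, mean JCT
print(obs.shape)  # (6,) -- three 2-dim frames concatenated
```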
Martin Asenov
Parexel AI Labs
Machine Learning, Distributed Systems, Robotics

Qiwen Deng
Edinburgh Research Centre, Central Software Institute, Huawei

Gingfung Yeung
Edinburgh Research Centre, Central Software Institute, Huawei

Adam Barker
Professor of Computer Science, University of St Andrews
Computer Systems, Cloud Computing, Distributed Systems