A Reinforcement Learning-Driven Task Scheduling Algorithm for Multi-Tenant Distributed Systems

📅 2025-08-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Addressing the challenge of jointly managing dynamic resource fluctuations, heterogeneous tenant requirements, and fairness guarantees in multi-tenant distributed systems, this paper proposes a reinforcement learning–based adaptive task scheduling framework. We formulate scheduling as a Markov decision process and design a multi-objective reward function integrating task latency, resource utilization, and tenant fairness. Using the Proximal Policy Optimization (PPO) algorithm, we jointly train policy and value networks to enhance training stability and cross-scenario generalization. Evaluated on real-data-driven multi-tenant workloads, our approach significantly outperforms state-of-the-art schedulers: it reduces average task latency by 23.6%, improves resource utilization by 18.4%, and increases the Jain’s fairness index by 0.31 across tenants. The framework balances theoretical rigor with practical deployability, offering a principled yet scalable solution for fair and efficient resource orchestration in dynamic multi-tenant environments.
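As a hedged illustration (not the authors' implementation), the multi-objective reward described above, combining task latency, resource utilization, and Jain's fairness index across tenants, could be sketched as follows; the weights `w_lat`, `w_util`, and `w_fair` are hypothetical tuning parameters not specified in the paper:

```python
def jain_fairness(allocations):
    """Jain's fairness index over per-tenant allocations.

    Equals 1.0 when all tenants receive equal shares and
    approaches 1/n as allocation concentrates on one tenant.
    """
    n = len(allocations)
    total = sum(allocations)
    if n == 0 or total == 0:
        return 0.0
    return total ** 2 / (n * sum(x * x for x in allocations))

def reward(latency, utilization, allocations, w_lat=1.0, w_util=1.0, w_fair=1.0):
    """Weighted multi-objective reward: penalize latency,
    reward utilization and tenant fairness."""
    return (-w_lat * latency
            + w_util * utilization
            + w_fair * jain_fairness(allocations))
```

The weights trade off the three objectives; the paper reports joint gains on all three metrics rather than specific weight values.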

📝 Abstract
This paper addresses key challenges in task scheduling for multi-tenant distributed systems, including dynamic resource variation, heterogeneous tenant demands, and fairness assurance, and proposes an adaptive scheduling method based on reinforcement learning. The scheduling process is modeled as a Markov decision process, with the state space, action space, and reward function defined accordingly. A policy learning framework built around Proximal Policy Optimization (PPO) enables dynamic perception of complex system states and real-time decision-making. Under a multi-objective reward mechanism, the scheduler jointly optimizes task latency, resource utilization, and tenant fairness, while coordination between the policy network and the value network continuously refines the scheduling strategy and improves overall system performance. To validate the proposed method, experiments were conducted in multi-scenario environments built from a real-world public dataset, evaluating task latency control, resource efficiency, policy stability, and fairness. The results show that the method outperforms existing scheduling approaches across multiple evaluation metrics and demonstrates strong stability and generalization. The framework offers practical engineering value in policy design, dynamic resource modeling, and multi-tenant service assurance, effectively improving scheduling efficiency and resource management in distributed systems under complex conditions.
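The abstract describes defining a state space and action space for the scheduling MDP but does not list their components. A minimal sketch of how such a formulation might look, with all field names being illustrative assumptions rather than the paper's actual design:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SchedulerState:
    """Hypothetical system snapshot observed at each decision step."""
    node_cpu_free: List[float]   # per-node free CPU fraction
    node_mem_free: List[float]   # per-node free memory fraction
    queue_lengths: List[int]     # pending tasks per tenant
    tenant_shares: List[float]   # cumulative resources granted per tenant

def action_space(state: SchedulerState) -> List[int]:
    """Discrete actions: index of the node to place the
    head-of-queue task on."""
    return list(range(len(state.node_cpu_free)))
```

In practice such a state vector would be flattened and normalized before being fed to the policy and value networks.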
Problem

Research questions and friction points this paper is trying to address.

Dynamic task scheduling in multi-tenant distributed systems
Balancing latency, resource use, and tenant fairness
Reinforcement learning for adaptive scheduling decisions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement learning for dynamic task scheduling
PPO algorithm optimizes latency and fairness
Multi-objective reward enhances system performance
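The PPO algorithm named above stabilizes training via a clipped surrogate objective. A minimal per-sample sketch of that standard objective (not the authors' code), where `ratio` is the new-to-old policy probability ratio and `advantage` is the estimated advantage from the value network:

```python
def ppo_clip_loss(ratio, advantage, eps=0.2):
    """Per-sample PPO clipped surrogate loss:
    loss = -min(r * A, clip(r, 1 - eps, 1 + eps) * A).

    Clipping removes the incentive to move the policy ratio
    far outside [1 - eps, 1 + eps], which stabilizes updates.
    """
    clipped = max(min(ratio, 1.0 + eps), 1.0 - eps)
    return -min(ratio * advantage, clipped * advantage)
```

In a full implementation this loss is averaged over a batch and combined with a value-function loss and an entropy bonus, which matches the paper's joint training of policy and value networks.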
Authors

Xiaopei Zhang, Machine Learning Engineer (Image Segmentation, Multimodal Data Fusion, AI Infrastructure & Efficiency, Automation & Robotics)
Xingang Wang, Institute of Automation, Chinese Academy of Sciences
Xin Wang, University of the Chinese Academy of Sciences