TD3-Sched: Learning to Orchestrate Container-based Cloud-Edge Resources via Distributed Reinforcement Learning

📅 2025-09-23

📈 Citations: 0

✨ Influential: 0

career value

227K/year

🤖 AI Summary

In cloud-edge collaborative environments, edge nodes suffer from resource constraints and strict latency sensitivity, rendering centralized schedulers prone to performance bottlenecks and SLO violations. To address this, we propose TD3-Sched—the first distributed reinforcement learning scheduler for cloud-edge orchestration, built upon Twin Delayed Deep Deterministic Policy Gradient (TD3). It enables decentralized, continuous-action-space joint optimization of CPU and memory allocation. By pioneering the integration of distributed RL into cloud-edge scheduling, TD3-Sched achieves real-time, adaptive decision-making under dynamic workloads. Evaluated on a real-world testbed and Alibaba Cloud production scheduling traces, TD3-Sched reduces end-to-end latency by 16%–38.6% over state-of-the-art baselines, achieves an SLO violation rate of only 0.47%, converges faster, and delivers significantly more stable service quality.

Technology Category

Application Category

📝 Abstract

Resource scheduling in cloud-edge systems is challenging as edge nodes run latency-sensitive workloads under tight resource constraints, while existing centralized schedulers can suffer from performance bottlenecks and user experience degradation. To address the issues of distributed decisions in cloud-edge environments, we present TD3-Sched, a distributed reinforcement learning (DRL) scheduler based on Twin Delayed Deep Deterministic Policy Gradient (TD3) for continuous control of CPU and memory allocation, which can achieve optimized decisions for resource provisioning under dynamic workloads. On a realistic cloud-edge testbed with SockShop application and Alibaba traces, TD3-Sched achieves reductions of 17.9% to 38.6% in latency under same loads compared with other reinforcement-learning and rule-based baselines, and 16% to 31.6% under high loads. TD3-Sched also shows superior Service Level Objective (SLO) compliance with only 0.47% violations. These results indicate faster convergence, lower latency, and more stable performance while preserving service quality in container-based cloud-edge environment compared with the baselines.

Problem

Research questions and friction points this paper is trying to address.

Resource scheduling in cloud-edge systems with latency-sensitive workloads

Distributed decision-making challenges under tight resource constraints

Optimizing CPU and memory allocation for dynamic container workloads

Innovation

Methods, ideas, or system contributions that make the work stand out.

Distributed reinforcement learning for cloud-edge scheduling

TD3 algorithm for continuous CPU and memory control

Optimized resource provisioning under dynamic workloads

🔎 Similar Papers

No similar papers found.