🤖 AI Summary
In cloud-native databases, highly dynamic resource states and complex scheduling lead to low coordination efficiency and unstable policies. To address this, we propose an adaptive resource orchestration method based on multi-agent reinforcement learning (MARL). Our approach features: (1) a heterogeneous role-based agent architecture, where compute, storage, and other resource entities are endowed with distinct, specialized policy capabilities; and (2) a reward shaping mechanism that jointly leverages local observations and global feedback to mitigate policy bias arising from partial observability and improve convergence stability. Evaluated on real production workloads, our method achieves significant improvements: +18.3% in resource utilization, −32.7% in average scheduling latency, 2.1× faster policy convergence, and enhanced system stability, fairness, and cross-workload generalization.
📝 Abstract
This paper addresses the challenges of high resource dynamism and scheduling complexity in cloud-native database systems by proposing an adaptive resource orchestration method based on multi-agent reinforcement learning. The method introduces a heterogeneous role-based agent modeling mechanism that allows different resource entities, such as compute nodes, storage nodes, and schedulers, to adopt distinct policy representations, so that agents better reflect their diverse functional responsibilities and local environmental characteristics within the system. A reward-shaping mechanism integrates local observations with global feedback to mitigate the policy-learning bias caused by incomplete state observations; by combining real-time local performance signals with global system value estimation, it improves coordination among agents and stabilizes policy convergence. A unified multi-agent training framework is developed and evaluated on a representative production scheduling dataset. Experimental results show that the proposed method outperforms traditional approaches across multiple key metrics, including resource utilization, scheduling latency, policy convergence speed, system stability, and fairness, demonstrating strong generalization and practical utility. Across a range of experimental scenarios, the method effectively handles orchestration tasks with high concurrency, high-dimensional state spaces, and complex dependency relationships, confirming its advantages in real-world, large-scale scheduling environments.
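The abstract does not give the exact reward-shaping formula. One common way to realize "combining real-time local performance signals with global system value estimation" is a convex combination of the two terms; the sketch below is a minimal illustration under that assumption, where the mixing weight `beta` and the function name `shaped_reward` are hypothetical, not taken from the paper.

```python
def shaped_reward(local_reward: float, global_value: float, beta: float = 0.5) -> float:
    """Blend an agent's local performance signal with a global system
    value estimate. beta controls how much global feedback is mixed in:
    beta = 0 uses only the local signal (prone to partial-observability
    bias), beta = 1 uses only the global estimate (slow per-agent credit
    assignment); intermediate values trade the two off.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return (1.0 - beta) * local_reward + beta * global_value


# Example: a compute-node agent observes a strong local signal (e.g. high
# node-level utilization) while the global value estimate is low (e.g. the
# cluster as a whole is imbalanced); the shaped reward moderates the two.
r = shaped_reward(local_reward=0.9, global_value=0.2, beta=0.5)
```

In practice the global value term would come from a centralized critic evaluated on the joint state, as in centralized-training/decentralized-execution MARL setups, while the local term is computed from each agent's own observations.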