🤖 AI Summary
To address the sluggish and inefficient resource-scaling responses of edge-cloud systems under dynamic workloads, this paper proposes an in-place scaling engine based on multi-agent reinforcement learning (MARL). It is the first to introduce a MARL framework for in-place scaling in edge-cloud environments, enabling distributed agents to collaboratively make real-time scaling decisions guided by microservice performance feedback. The engine employs a dual-algorithm architecture integrating Deep Q-Networks (DQN) and Proximal Policy Optimization (PPO), jointly optimizing for low latency and high resource efficiency. Experimental results demonstrate that, compared to conventional heuristic strategies, the proposed approach reduces average scaling response latency by 23.6% and improves CPU utilization by 31.4%, significantly enhancing system adaptability and scalability.
📝 Abstract
Modern edge-cloud systems face challenges in efficiently scaling resources to handle dynamic and unpredictable workloads. Traditional scaling approaches typically rely on static thresholds and predefined rules, which are often inadequate for optimizing resource utilization and maintaining performance in distributed, dynamic environments. This inefficiency hinders the adaptability and performance required in edge-cloud infrastructures, which the newly proposed in-place scaling is designed to deliver. To address this problem, we propose the Multi-Agent Reinforcement Learning-based In-place Scaling Engine (MARLISE), which enables seamless, dynamic, and reactive control with in-place resource scaling. We develop our solution using two Deep Reinforcement Learning algorithms: Deep Q-Network (DQN) and Proximal Policy Optimization (PPO). We analyze each version of the proposed MARLISE solution under dynamic workloads, demonstrating its ability to keep microservice response times low while remaining scalable. Our results show that MARLISE-based approaches outperform heuristic methods in managing resource elasticity, maintaining microservice response times, and achieving higher resource efficiency.
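To make the core idea concrete, the sketch below shows a single scaling agent of the kind the abstract describes: it observes a microservice's resource state, picks an in-place scaling action (grow, hold, or shrink the CPU allocation), and learns from a reward that trades response time against resource efficiency. This is a minimal illustrative stand-in, not the paper's implementation: it uses tabular Q-learning instead of a DQN/PPO network, and the environment (`toy_env_step`), the action set, and all parameter values are hypothetical.

```python
import random

# Actions an agent can take on a running microservice, in place:
# remove one CPU share, hold, or add one CPU share.
ACTIONS = [-1, 0, +1]


class ScalingAgent:
    """Tabular Q-learning agent (a simplified stand-in for the paper's DQN)."""

    def __init__(self, epsilon=0.1, alpha=0.5, gamma=0.9):
        self.q = {}  # (state, action_index) -> estimated value
        self.epsilon, self.alpha, self.gamma = epsilon, alpha, gamma

    def act(self, state):
        # Epsilon-greedy: mostly exploit the best known action, sometimes explore.
        if random.random() < self.epsilon:
            return random.randrange(len(ACTIONS))
        return max(range(len(ACTIONS)),
                   key=lambda a: self.q.get((state, a), 0.0))

    def learn(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        best_next = max(self.q.get((next_state, a), 0.0)
                        for a in range(len(ACTIONS)))
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.alpha * (
            reward + self.gamma * best_next - old)


def toy_env_step(cpu, action):
    """Hypothetical microservice: latency falls as CPU grows; the reward
    penalizes both high latency and over-provisioning, mirroring the paper's
    joint low-latency / high-efficiency objective."""
    cpu = min(8, max(1, cpu + ACTIONS[action]))  # in-place resize, bounded
    latency = 100.0 / cpu                        # toy latency model (ms)
    reward = -latency - 4.0 * cpu                # latency + resource cost
    return cpu, latency, reward


random.seed(0)
agent = ScalingAgent()
cpu = 1
for _ in range(2000):
    state = cpu                       # toy state: current allocation
    action = agent.act(state)
    cpu, latency, reward = toy_env_step(cpu, action)
    agent.learn(state, action, reward, cpu)
```

In MARLISE, multiple such agents act collaboratively, one per microservice, and the tabular lookup is replaced by DQN or PPO function approximation so the state can include rich, continuous performance feedback rather than a single discrete value.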