Towards VM Rescheduling Optimization Through Deep Reinforcement Learning

📅 2025-03-30

🏛️ European Conference on Computer Systems

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

Problem: Large-scale data centers suffer from severe resource fragmentation on physical machines because VMs are continually created and released, which lowers resource utilization and limits scalability. Existing VM rescheduling approaches have high inference latency and adapt poorly to state changes that occur while they run, which limits their practical deployment. Method: The paper proposes VMR2L, a low-latency reinforcement learning system for VM rescheduling. VMR2L combines a two-stage decision framework that accommodates diverse constraints, a relation-aware, graph-based feature extraction module, and a risk-seeking evaluation mechanism that lets users trade latency against accuracy. Contribution/Results: Evaluated on data from an industry-scale data center, VMR2L achieves near-optimal rescheduling solutions with an inference time of only seconds (over 100× faster than conventional methods), while reducing fragmentation and improving system responsiveness.

📝 Abstract

Modern industry-scale data centers need to manage a large number of virtual machines (VMs). Due to the continual creation and release of VMs, many small resource fragments are scattered across physical machines (PMs). To handle these fragments, data centers periodically reschedule some VMs to alternative PMs, a practice commonly referred to as VM rescheduling. Despite the increasing importance of VM rescheduling as data centers grow in size, the problem remains understudied. We first show that, unlike most combinatorial optimization tasks, the inference time of VM rescheduling algorithms significantly influences their performance, due to dynamic VM state changes during this period. This causes existing methods to scale poorly. Therefore, we develop a reinforcement learning system for VM rescheduling, VMR2L, which incorporates a set of customized techniques, such as a two-stage framework that accommodates diverse constraints and workload conditions, a feature extraction module that captures relational information specific to rescheduling, as well as a risk-seeking evaluation enabling users to optimize the trade-off between latency and accuracy. We conduct extensive experiments with data from an industry-scale data center. Our results show that VMR2L can achieve a performance comparable to the optimal solution but with a running time of seconds. Code and datasets are open-sourced.
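To make the notion of "resource fragments" concrete, here is a minimal Python sketch of one way fragmentation could be scored. The definition is an assumption made for illustration, not taken from the abstract: a fragment is the free CPU on a PM that cannot host one more VM of a hypothetical reference size `ref_size`, and the fragment rate is the share of all free cores that are fragments. The names `PM`, `fragment_rate`, and `migrate` are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PM:
    capacity: int  # total CPU cores on the physical machine
    used: int      # cores currently allocated to VMs

    @property
    def free(self) -> int:
        return self.capacity - self.used


def fragment_rate(pms, ref_size=16):
    """Share of free cores that cannot host another ref_size-core VM (assumed metric)."""
    total_free = sum(pm.free for pm in pms)
    if total_free == 0:
        return 0.0
    fragments = sum(pm.free % ref_size for pm in pms)
    return fragments / total_free


def migrate(pms, src, dst, cores):
    """Move a VM of `cores` cores from PM index `src` to PM index `dst`."""
    if pms[dst].free < cores:
        raise ValueError("destination PM lacks capacity")
    pms[src].used -= cores
    pms[dst].used += cores
```

Under this assumed metric, two 32-core PMs with 8 free cores each score 1.0, since neither can take a 16-core VM. Moving one 8-core VM from the first PM to the second leaves 16 and 0 free cores, and the score drops to 0.0.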

Problem

Research questions and friction points this paper is trying to address.

Optimizing VM rescheduling in data centers efficiently

Reducing inference time impact on VM rescheduling performance

Balancing latency and accuracy in dynamic VM management
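The inference-time issue listed above can be illustrated with a small Python sketch. It assumes a plan is a list of migrations computed on an old snapshot and checks which of them are still applicable once the solver returns. The function name `still_feasible` and the tuple layout are hypothetical, and real systems check more constraints than capacity.

```python
def still_feasible(plan, free_now, alive_now):
    """Filter a stale plan against the current cluster state.

    plan:      list of (vm_id, cores, src_pm, dst_pm) computed on an old snapshot
    free_now:  free cores per PM at deployment time
    alive_now: set of vm_ids that still exist at deployment time
    """
    free = list(free_now)
    kept = []
    for vm_id, cores, src, dst in plan:
        # Skip migrations whose VM was released or whose target filled up.
        if vm_id not in alive_now or free[dst] < cores:
            continue
        free[dst] -= cores
        free[src] += cores
        kept.append((vm_id, cores, src, dst))
    return kept
```

The longer the solver runs, the more VMs are created or released in the meantime, so a larger share of the plan may be filtered out. This is one way to read the abstract's point that inference time affects rescheduling performance.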

Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep reinforcement learning for VM rescheduling

Two-stage framework for diverse constraints

Risk-seeking evaluation for latency-accuracy trade-off
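The two-stage idea and the risk-seeking evaluation can be sketched together in Python. This is an interpretation under stated assumptions, not the VMR2L implementation: stage 1 picks a VM, stage 2 masks destination PMs that violate a capacity constraint and picks among the rest, and the evaluation samples several plans and keeps the best one. Uniform random choices stand in for a learned policy, and `REF`, `rollout`, and `risk_seeking_eval` are hypothetical names.

```python
import random

REF = 16  # hypothetical reference VM size (cores) used to define a fragment


def fragment_rate(free):
    """Share of free cores that cannot host another REF-core VM (assumed metric)."""
    total = sum(free)
    return sum(f % REF for f in free) / total if total else 0.0


def rollout(free, vms, steps, rng):
    """Sample one plan of at most `steps` migrations.

    free: free cores per PM; vms: list of (host_pm, cores).
    """
    free, vms, plan = list(free), list(vms), []
    if not vms:
        return fragment_rate(free), plan
    for _ in range(steps):
        # Stage 1: choose which VM to move (random stand-in for the policy).
        i = rng.randrange(len(vms))
        src, cores = vms[i]
        # Stage 2: mask PMs that cannot host the VM, then choose a destination.
        legal = [p for p in range(len(free)) if p != src and free[p] >= cores]
        if not legal:
            continue
        dst = rng.choice(legal)
        free[src] += cores
        free[dst] -= cores
        vms[i] = (dst, cores)
        plan.append((i, src, dst))
    return fragment_rate(free), plan


def risk_seeking_eval(free, vms, steps, samples, seed=0):
    """Sample `samples` plans and keep the one with the lowest fragment rate."""
    rng = random.Random(seed)
    results = [rollout(free, vms, steps, rng) for _ in range(samples)]
    return min(results, key=lambda r: r[0])
```

In this sketch `samples` is the latency-accuracy knob: more samples cost more inference time but can only match or improve the best plan found with the same seed.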
