🤖 AI Summary
To address the mismatch between the dynamic memory demands of HPC workloads and the vertical scaling mechanisms of cloud-native containerized environments, this paper proposes an adaptive scheduling framework for node-level memory elasticity. The method combines modeling of HPC memory consumption patterns with Kubernetes' runtime resource hot-update capability, enabling, for the first time, eviction-free, fine-grained memory scaling. Built on the Vertical Pod Autoscaler (VPA) extension architecture, the framework incorporates a memory behavior analysis model, real-time per-Pod memory hot-updates, and node-level resource coordination. Experimental evaluation shows that, compared to the standard VPA, the framework reduces average memory usage by 37%, eliminates out-of-memory (OOM) failures entirely, and robustly supports elastic execution of nine representative HPC applications.
📝 Abstract
Existing state-of-the-art vertical autoscalers for containerized environments are traditionally built for cloud applications, which may behave differently from HPC workloads with their dynamic resource consumption. In these environments, autoscalers may allocate resources inefficiently. This work analyzes nine representative HPC applications with distinct memory consumption patterns. Our results identify the limitations and inefficiencies of the Kubernetes Vertical Pod Autoscaler (VPA) in enabling memory-elastic execution of HPC applications. We propose, implement, and evaluate ARC-V, a policy that leverages both in-flight resource updates of pods in Kubernetes and knowledge of the memory consumption patterns of HPC applications to achieve elastic memory provisioning at the node level. Our results show that ARC-V effectively saves memory while eliminating out-of-memory errors compared to the standard Kubernetes VPA.
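The "in-flight resource updates" the abstract refers to correspond to Kubernetes' in-place pod resize capability (the `InPlacePodVerticalScaling` feature gate, introduced as alpha in v1.27), which lets a container's memory request/limit be changed without evicting the pod. A minimal sketch of a pod manifest that opts into restart-free memory resizing follows; the pod name `hpc-app`, container name `solver`, and image are illustrative placeholders, not from the paper:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hpc-app            # illustrative name
spec:
  containers:
  - name: solver           # illustrative name
    image: example/hpc-solver:latest   # placeholder image
    resizePolicy:
    - resourceName: memory
      restartPolicy: NotRequired   # resize memory in place, without restarting the container
    resources:
      requests:
        memory: "1Gi"
      limits:
        memory: "1Gi"
```

A controller (or an operator via `kubectl patch`) can then raise or lower the container's memory values on the running pod; in recent Kubernetes releases this goes through the pod's `resize` subresource, while older versions patch the pod spec directly. ARC-V's contribution, per the abstract, is the node-level policy deciding when and by how much to apply such updates, not the resize mechanism itself.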