Resource Management Schemes for Cloud-Native Platforms with Computing Containers of Docker and Kubernetes

📅 2020-10-20
🏛️ arXiv.org
📈 Citations: 24
Influential: 1

🤖 AI Summary
This study addresses prolonged task completion times, low resource utilization, and high resource release latency in Docker and Kubernetes containers on cloud-native platforms running compute-intensive workloads (e.g., big data and deep learning). We evaluate the performance impact of different resource management schemes through system-level monitoring (leveraging cgroups and metrics-server) and multi-workload stress testing. The experiments quantify how resource configurations affect task completion time (up to 79.4% lower after changing the default setting, and up to 96.7% higher depending on the platform's resource management scheme) and resource release latency (delayed by up to 116.7% across systems). Based on these findings, we offer evidence-based guidance for configuration tuning and for identifying the settings that drive these delays. Our results provide an empirical basis for resource management tuning and deployment decisions in cloud-native environments.
📝 Abstract
Businesses have increasingly adopted cloud technology and incorporated it into internal processes over the last decade. Cloud-based deployment provides on-demand availability without active management. More recently, the concept of cloud-native applications has been proposed, and it represents an invaluable step toward helping organizations develop software faster and update it more frequently to achieve dramatic business outcomes. Cloud-native is an approach to building and running applications that exploits the advantages of the cloud computing delivery model. It is more about how applications are created and deployed than where. Container-based virtualization technology, such as Docker and Kubernetes, serves as the foundation for cloud-native applications. This paper investigates the performance of two popular computation-intensive applications, big data and deep learning, in a cloud-native environment. We analyze the system overhead and resource usage for these applications. Through extensive experiments, we show that the completion time is reduced by up to 79.4% by changing the default setting and increases by up to 96.7% due to different resource management schemes on the two platforms. Additionally, resource release is delayed by up to 116.7% across different systems. Our work can guide developers, administrators, and researchers to better design and deploy their applications by selecting and configuring a hosting platform.
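The percentages in the abstract are relative changes against a baseline measurement, so a change above 100% is possible for increases and delays. The following Python sketch shows how such figures translate into absolute values. The 100-second baseline and the helper names `relative_change` and `apply_change` are illustrative assumptions and do not come from the paper.

```python
def relative_change(baseline: float, measured: float) -> float:
    """Return the percentage change of `measured` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (measured - baseline) / baseline * 100.0


def apply_change(baseline: float, percent: float) -> float:
    """Return the value obtained by changing `baseline` by `percent` percent."""
    return baseline * (1.0 + percent / 100.0)


# Hypothetical baseline of 100 s (an assumption for illustration only).
baseline_s = 100.0

# A 79.4% reduction leaves roughly one fifth of the baseline time.
reduced_s = apply_change(baseline_s, -79.4)

# A 96.7% increase nearly doubles the baseline time.
increased_s = apply_change(baseline_s, 96.7)

# A 116.7% delay more than doubles the baseline release time.
delayed_s = apply_change(baseline_s, 116.7)

for label, value in [("reduced", reduced_s),
                     ("increased", increased_s),
                     ("delayed", delayed_s)]:
    print(f"{label}: {value:.1f} s "
          f"({relative_change(baseline_s, value):+.1f}% vs. baseline)")
```

Read this way, the reported 79.4% reduction corresponds to a speed-up of nearly 5x over the baseline, while the 116.7% delay means resources take a little over twice as long to be released.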
Problem

Research questions and friction points this paper is trying to address.

Evaluating performance of big data and deep learning applications
Analyzing system overhead and resource usage in cloud-native environments
Investigating resource management schemes for Docker and Kubernetes platforms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimizing Docker and Kubernetes container resource management
Analyzing performance of compute-intensive cloud-native applications
Evaluating resource usage and system overhead reduction
Ying Mao
Department of Computer and Information Science at Fordham University in New York City
Yuqi Fu
Department of Computer and Information Science at Fordham University in New York City
Suwen Gu
Department of Computer and Information Science at Fordham University in New York City
Sudip Vhaduri
Assistant Professor, Purdue University | Director, mAI Lab
ML/AI (GenAI), Mobile/Wearable Computing, mHealth, Biometric/IoT Authentication, Education
Long Cheng
School of Computing, Dublin City University, Ireland
Qingzhi Liu
Information Technology Group, Wageningen University, The Netherlands