Data analysis of cloud virtualization experiments

📅 2026-02-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses unpredictable end-to-end latency in cloud virtualized environments, which stems from virtualization overhead at the CPU, I/O, and network resources. The authors run systematic network measurement experiments on several virtualization platforms (KVM, LXC, and Docker) under varied workload conditions, and collect packet round-trip time data into a dataset suitable for machine learning-based network performance modeling. Using data pre-processing, correlation analysis, dimensionality reduction, and clustering, the work evaluates how these virtualization technologies affect latency. The resulting dataset is intended to support network performance prediction and resource scheduling decisions, providing an empirical basis for performance optimization in cloud environments.

📝 Abstract
The cloud computing paradigm underlies data center and telecommunication infrastructure design. Relying heavily on virtualization, it slices hardware and software resources into smaller software units that are more flexible to manipulate. Given these considerable benefits, several forms of virtualization have emerged, including Full Virtualization and OS Virtualization, each with different processing and communication overheads. As a result, predicting packet throughput at the data plane becomes more challenging because of the additional virtualization overhead at the CPU, I/O, and network resources. This research presents a dataset of active network measurements collected while varying several experimental parameters, including CPU affinity, frequency of echo packet injection, type of virtual network driver, presence of CPU, I/O, or network load, and the number of concurrent VMs. The virtualization technologies used in the study include KVM, LXC, and Docker. The work examines their impact on a key network metric, end-to-end latency, and builds data models to evaluate the impact of a cloud computing environment on packet round-trip time. To support data exploration and visualization, the dataset was subjected to pre-processing, correlation analysis, dimensionality reduction, and clustering. In addition, the paper provides a brief analysis of the dataset, demonstrating its use in developing machine learning-based systems that support administrator decision-making.
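
Two of the exploration steps named in the abstract, correlation analysis and clustering, can be illustrated with a minimal Python sketch. This is not the authors' code and it uses no data from the dataset: the samples are synthetic, the latency coefficients are invented for illustration, and `pearson` and `kmeans_1d` are hypothetical helper names. Pre-processing and dimensionality reduction are omitted to keep the example short.

```python
import random
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    mx, my = mean(xs), mean(ys)
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (pstdev(xs) * pstdev(ys))


def kmeans_1d(values, k=2, iters=20):
    """Plain k-means on scalar values; returns the k cluster centers."""
    ordered = sorted(values)
    # Initialise centers at evenly spaced positions of the sorted sample.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # Keep the previous center if a cluster received no points.
        centers = [mean(g) if g else centers[i] for i, g in enumerate(groups)]
    return centers


# Synthetic stand-in for measurement records (NOT values from the dataset):
# RTT grows with the number of concurrent VMs and with background load.
random.seed(0)
n_vms, rtt_ms = [], []
for _ in range(200):
    vms = random.randint(1, 8)
    loaded = random.random() < 0.5
    n_vms.append(vms)
    rtt_ms.append(0.2 + 0.1 * vms + (0.5 if loaded else 0.0) + random.gauss(0, 0.02))

r = pearson(n_vms, rtt_ms)
centers = kmeans_1d(rtt_ms, k=2)
print(f"correlation(n_vms, rtt_ms) = {r:.2f}")
print("RTT cluster centers (ms):", [round(c, 2) for c in centers])
```

On real measurements, the same two calls would be applied per virtualization technology and per parameter. A library such as scikit-learn would be a more typical choice than hand-written helpers; the standard-library version is used here only to keep the sketch self-contained.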
Problem

Research questions and friction points this paper is trying to address.

virtualization overhead
packet throughput
end-to-end latency
cloud computing
network performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

virtualization overhead
network latency modeling
cloud measurement dataset
machine learning for cloud management
end-to-end latency analysis
P. R. X. D. Carmo
Centro de Informática (CIn), Grupo de Pesquisa em Redes e Telecomunicações (GPRT), Universidade Federal de Pernambuco (UFPE), Recife, Brasil
Eduardo Freitas
Centro de Informática (CIn), Grupo de Pesquisa em Redes e Telecomunicações (GPRT), Universidade Federal de Pernambuco (UFPE), Recife, Brasil
A. T. O. Filho
Centro de Informática (CIn), Grupo de Pesquisa em Redes e Telecomunicações (GPRT), Universidade Federal de Pernambuco (UFPE), Recife, Brasil
Judith Kelner
Universidade Federal de Pernambuco
D. Sadok
Centro de Informática (CIn), Grupo de Pesquisa em Redes e Telecomunicações (GPRT), Universidade Federal de Pernambuco (UFPE), Recife, Brasil