Cloud based DevOps Framework for Identifying Risk Factors of Hospital Utilization

📅 2025-04-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the poor reproducibility and limited scalability of analyses on sparse healthcare datasets such as NHANES when investigating hospitalization risk factors (e.g., diabetes, obesity, cardiovascular disease). It proposes a modular, cloud-native DevOps analytics framework designed to generalize beyond healthcare. Methodologically, the framework integrates CI/CD pipelines (GitHub Actions/Jenkins), cloud platforms (AWS/Azure), the Python scientific stack, data version control, and synthetic data generation to support automated NHANES data updates, hybrid modeling, and end-to-end analysis. Key contributions include: (1) more reliable and reproducible analysis; (2) an online update method that can draw on both real and synthetic data; and (3) a modular structure that the authors argue can carry over to other sparse-data domains, including environmental science, robotics, cybersecurity, and cultural heritage and arts.

📝 Abstract
A scalable and reliable system is required to analyze National Health and Nutrition Examination Survey (NHANES) data efficiently and to understand hospital utilization risk factors. This study investigates the integration of continuous integration and deployment (CI/CD) practices into data science workflows, focusing on the analysis of NHANES data to identify the prevalence of diabetes, obesity, and cardiovascular diseases. An end-to-end cloud-based DevOps framework is proposed for data analysis; it examines risk factors associated with hospital utilization and evaluates key hospital utilization metrics. We also highlight the modular structure of the framework, which can be generalized to domains beyond healthcare. The framework provides an online data update method that can be extended further using both real and synthetic data. As such, the framework can be especially useful for sparse-dataset domains such as environmental science, robotics, cybersecurity, and cultural heritage and arts.
Problem

Research questions and friction points this paper is trying to address.

Analyze NHANES data to identify hospital utilization risk factors
Integrate CI/CD practices in data science for healthcare analysis
Propose scalable cloud DevOps framework for multi-domain application
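
As an illustration of how CI/CD practices can apply to a data analysis workflow, the sketch below shows one possible gate a pipeline step might use: fingerprint the dataset and rerun the analysis only when the fingerprint changes. This is a minimal sketch under stated assumptions, not the paper's implementation. The function names and the row fields (`id`, `bmi`) are hypothetical, and rows are assumed to be JSON-serializable dictionaries.

```python
from __future__ import annotations

import hashlib
import json


def dataset_fingerprint(rows: list[dict]) -> str:
    """Return a SHA-256 digest that does not depend on row or key order."""
    # Serialize each row with sorted keys, then sort the serialized rows,
    # so that reordering the input does not change the digest.
    canonical = sorted(json.dumps(row, sort_keys=True) for row in rows)
    digest = hashlib.sha256()
    for line in canonical:
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def needs_rerun(rows: list[dict], last_fingerprint: str | None) -> bool:
    """True when the data differs from the last analyzed snapshot."""
    return dataset_fingerprint(rows) != last_fingerprint


# Hypothetical rows; field names are illustrative only.
snapshot = [{"id": 1, "bmi": 31.2}, {"id": 2, "bmi": 24.8}]
previous = dataset_fingerprint(snapshot)
updated = snapshot + [{"id": 3, "bmi": 27.5}]
```

In a pipeline, the stored fingerprint would typically live alongside the versioned data, and `needs_rerun` would decide whether the downstream analysis job runs.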
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cloud-based DevOps framework for NHANES data analysis
Modular structure adaptable to non-healthcare domains
Online data update method with real and synthetic data
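
The online update idea with mixed real and synthetic data can be sketched as follows. This is a hedged illustration only: the `Record` fields and both function names are hypothetical, and simple bootstrap resampling of real records stands in for the synthetic data generator, which this summary does not specify.

```python
from __future__ import annotations

import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    record_id: str          # hypothetical identifier
    has_diabetes: bool
    hospitalized: bool
    synthetic: bool = False  # marks rows that were generated, not observed


def apply_update(store: dict[str, Record], batch: list[Record]) -> int:
    """Upsert a batch into the store; return how many records were new or changed."""
    changed = 0
    for rec in batch:
        if store.get(rec.record_id) != rec:
            store[rec.record_id] = rec
            changed += 1
    return changed


def top_up_with_synthetic(store: dict[str, Record], min_size: int, seed: int = 0) -> int:
    """Add resampled copies of real records until the store holds min_size rows.

    Assumption: bootstrap resampling is used here as a placeholder generator.
    """
    rng = random.Random(seed)
    real = [r for r in store.values() if not r.synthetic]
    added = 0
    while real and len(store) < min_size:
        src = rng.choice(real)
        new_id = f"syn-{len(store)}"
        store[new_id] = Record(new_id, src.has_diabetes, src.hospitalized, synthetic=True)
        added += 1
    return added
```

Keeping the `synthetic` flag on every row lets later analysis steps report results on real data alone or on the combined set.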