🤖 AI Summary
This study addresses dynamic prediction of the risk of methicillin-resistant *Staphylococcus aureus* (MRSA) healthcare-associated infection (HAI) during an individual hospitalization. We propose GenHAI, the first generative probabilistic model designed specifically for longitudinal sequences of clinical test results within a single admission. Built on a probabilistic programming framework, GenHAI supports predictive, causal, and counterfactual inference while remaining interpretable. Evaluated on two real-world clinical datasets, GenHAI significantly outperforms state-of-the-art discriminative and conventional generative models, achieving an AUC improvement of at least 8.2%. Its transparent, mechanistic structure supports clinically actionable, interpretable, and verifiable decision-making for early risk stratification and targeted intervention planning, demonstrating translational potential in hospital infection prevention and control.
📝 Abstract
In 2019, the US Centers for Disease Control and Prevention (CDC) designated methicillin-resistant Staphylococcus aureus (MRSA) as a serious antimicrobial resistance threat. The risk of acquiring MRSA and suffering life-threatening consequences from it remains especially high for hospitalized patients because of a unique combination of factors, including comorbid conditions, immunosuppression, antibiotic use, and the risk of contact with contaminated hospital workers and equipment. In this paper, we present a novel generative probabilistic model, GenHAI, for modeling sequences of MRSA test results for patients during a single hospitalization. This model can be used to answer many questions that matter to hospital administrators seeking to mitigate the risk of MRSA infections. Our model is based on the probabilistic programming paradigm and can be used to approximately answer a variety of predictive, causal, and counterfactual questions. We demonstrate the efficacy of our model by comparing it against discriminative and generative machine learning models on two real-world datasets.