A Multi-Phase Analysis of Blood Culture Stewardship: Machine Learning Prediction, Expert Recommendation Assessment, and LLM Automation

📅 2025-04-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Overordering of blood cultures in emergency departments strains healthcare resources and contributes to inappropriate antibiotic use, pressures worsened by the global shortage. Method: We propose a multimodal machine learning model (XGBoost/LightGBM) integrating structured electronic health record data, clinical embeddings from physician notes extracted via ClinicalBERT/LLaMA-3, and ICD diagnosis codes to predict bacteremia risk. Contribution/Results: To our knowledge, this is the first study to systematically compare machine learning, expert consensus, and pure large language model (LLM)-based decision approaches using a multi-stage evaluation framework. Our model achieves an AUC of 0.81 and maintains high sensitivity (92%) while reaching a specificity of 75%, well above the specificity of the expert consensus (57%) and LLM-only (16%) baselines. It reduces unnecessary blood culture orders by 23%, demonstrating clinical utility for precision antimicrobial stewardship.

📝 Abstract

Blood cultures are often overordered without clear justification, straining healthcare resources and contributing to inappropriate antibiotic use, pressures worsened by the global shortage. In a study of 135,483 emergency department (ED) blood culture orders, we developed machine learning (ML) models to predict the risk of bacteremia using structured electronic health record (EHR) data and provider notes processed by a large language model (LLM). The structured model's AUC improved from 0.76 to 0.79 with note embeddings and reached 0.81 with added diagnosis codes. Compared to an expert recommendation framework applied by human reviewers and an LLM-based pipeline, our ML approach offered higher specificity without compromising sensitivity. The recommendation framework achieved a sensitivity of 86% and a specificity of 57%, while the LLM maintained high sensitivity (96%) but flagged most true negatives as positive, reducing specificity to 16%. These findings demonstrate that ML models integrating structured and unstructured data can outperform consensus recommendations, enhancing diagnostic stewardship beyond existing standards of care.
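
The sensitivity and specificity figures in the abstract depend on where the risk threshold is set. The following is a minimal sketch of that trade-off, assuming a hypothetical helper `sens_spec` and made-up toy labels and scores; none of it is the study's data or code.

```python
def sens_spec(labels, scores, threshold):
    """Return (sensitivity, specificity) when score >= threshold means 'order a culture'."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1:
            if predicted_positive:
                tp += 1  # bacteremia correctly flagged
            else:
                fn += 1  # bacteremia missed
        else:
            if predicted_positive:
                fp += 1  # unnecessary culture
            else:
                tn += 1  # culture correctly avoided
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical toy data: 1 = bacteremia, 0 = no bacteremia.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.2, 0.7, 0.3, 0.1, 0.05, 0.4]

strict = sens_spec(labels, scores, threshold=0.5)
lenient = sens_spec(labels, scores, threshold=0.15)
```

On this toy data, lowering the threshold catches every positive case but flags more negatives, which is the same pattern the abstract reports for the LLM pipeline (high sensitivity, low specificity).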

Problem

Research questions and friction points this paper is trying to address.

Predict bacteremia risk using ML and EHR data

Compare ML performance with expert recommendations

Improve blood culture ordering specificity via AI

Innovation

Methods, ideas, or system contributions that make the work stand out.

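Machine learning predicts bacteremia risk from structured EHR data (AUC 0.76)

LLM-derived note embeddings raise the model's AUC to 0.79, and diagnosis codes raise it to 0.81

Combined structured and unstructured data yield higher specificity than the expert recommendation framework without compromising sensitivity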
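
The AUC values used to compare the models can be read as the probability that a randomly chosen positive case receives a higher risk score than a randomly chosen negative case. The following is a minimal sketch of that computation, assuming a hypothetical helper `auc` and made-up scores for a "structured only" and a "structured plus notes" model; the numbers are illustrative and are not the study's results.

```python
def auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly, ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = bacteremia, 0 = no bacteremia.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
structured_scores = [0.9, 0.6, 0.2, 0.7, 0.3, 0.1, 0.05, 0.4]
combined_scores = [0.9, 0.7, 0.45, 0.6, 0.3, 0.1, 0.05, 0.4]

auc_structured = auc(labels, structured_scores)
auc_combined = auc(labels, combined_scores)
```

The pairwise form is quadratic in the number of cases, so it suits small illustrations; a rank-based implementation would be the usual choice for a dataset the size of the one described here.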
