🤖 AI Summary
Hospitals generate highly heterogeneous clinical data but lack automated early-warning systems, which impedes timely identification of critical events. To address this, we propose the Early Warning Index (EWI), a multimodal machine learning framework that jointly models structured electronic health records (EHRs) and unstructured clinical notes. EWI also integrates operational data, including surgical scheduling and ward occupancy, alongside clinical indicators to enable risk attribution. It combines SHAP-based interpretability analysis, clinician-in-the-loop threshold calibration, and a dashboard that stratifies patients into three risk tiers, supporting human-AI collaboration in early warning. Evaluated on 18,633 unique patients at a large U.S. hospital, EWI achieves a C-statistic of 0.796 for composite prediction of ICU admission, emergency response team dispatch, or mortality. EWI is currently deployed as a triage tool that helps clinicians prioritize at-risk patients and informs caregiver scheduling and resource allocation.
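The three-tier stratification described above can be illustrated with a minimal sketch. The cut-points `LOW_MAX` and `HIGH_MIN`, the tier names, and the function `risk_tier` are hypothetical; the source does not report the thresholds clinicians chose, only that they helped set them.

```python
from bisect import bisect_right

# Hypothetical cut-points for illustration only; the source does not
# report the alert thresholds that clinicians calibrated.
LOW_MAX = 0.10   # scores below this fall in the lowest tier
HIGH_MIN = 0.30  # scores at or above this fall in the highest tier
TIERS = ("low", "medium", "high")


def risk_tier(score, low_max=LOW_MAX, high_min=HIGH_MIN):
    """Map a predicted risk in [0, 1] to one of three tiers.

    low:    score < low_max
    medium: low_max <= score < high_min
    high:   score >= high_min
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be a probability in [0, 1]")
    if low_max >= high_min:
        raise ValueError("low_max must be smaller than high_min")
    # bisect_right counts how many cut-points are <= score (0, 1, or 2).
    return TIERS[bisect_right([low_max, high_min], score)]


if __name__ == "__main__":
    for s in (0.05, 0.20, 0.45):
        print(f"risk={s:.2f} -> {risk_tier(s)}")
```

Keeping the cut-points as parameters reflects the clinician-in-the-loop idea: recalibrating an alert threshold changes a number, not the model.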
📝 Abstract
Hospitals lack automated systems to harness the growing volume of heterogeneous clinical and operational data to effectively forecast critical events. Early identification of patients at risk for deterioration is essential not only for patient care quality monitoring but also for physician care management. However, translating varied data streams into accurate and interpretable risk assessments poses significant challenges due to inconsistent data formats. We develop a multimodal machine learning framework, the Early Warning Index (EWI), to predict the aggregate risk of ICU admission, emergency response team dispatch, and mortality. Key to EWI's design is a human-in-the-loop process: clinicians help determine alert thresholds and interpret model outputs, aided by Shapley Additive exPlanations (SHAP) that highlight the clinical and operational factors (e.g., scheduled surgeries, ward census) driving each patient's risk. We deploy EWI in a hospital dashboard that stratifies patients into three risk tiers. Using a dataset of 18,633 unique patients at a large U.S. hospital, our approach automatically extracts features from both structured and unstructured electronic health record (EHR) data and achieves a C-statistic of 0.796. It is currently used as a triage tool for proactively managing at-risk patients. The proposed approach saves physicians valuable time by automatically sorting patients by risk level, allowing them to concentrate on patient care rather than sifting through complex EHR data. By further pinpointing specific risk drivers, the proposed model informs adjustments to caregiver scheduling and the allocation of critical resources. As a result, clinicians and administrators can avert downstream complications, such as costly procedures or high readmission rates, and improve overall patient flow.
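For readers unfamiliar with the reported metric, the following is a minimal sketch of how a C-statistic can be computed for a composite outcome. It is not the authors' evaluation code: the function names `composite_label` and `c_statistic` and the toy data are assumptions made for illustration. For a binary outcome the C-statistic equals the area under the ROC curve: the probability that a randomly chosen patient who had the event received a higher risk score than a randomly chosen patient who did not, with ties counted as one half.

```python
def composite_label(icu_admission, response_team_dispatch, death):
    """Return 1 if any of the three component events occurred, else 0."""
    return int(bool(icu_admission or response_team_dispatch or death))


def c_statistic(labels, scores):
    """Pairwise concordance between binary labels and risk scores.

    Compares every (event, non-event) pair; O(n_pos * n_neg), which is
    fine for a sketch but slow for large cohorts.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0   # event patient ranked higher
            elif p == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, not drawn from the study.
    events = [(1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 0, 0)]
    labels = [composite_label(*e) for e in events]
    scores = [0.9, 0.1, 0.4, 0.6]
    print(c_statistic(labels, scores))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the reported 0.796 means roughly four in five such patient pairs are ordered correctly.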