Predicting Postoperative Stroke in Elderly SICU Patients: An Interpretable Machine Learning Model Using MIMIC Data

📅 2025-06-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the challenge of predicting postoperative stroke risk in elderly surgical ICU patients. We constructed a large-scale cohort of 19,085 elderly SICU admissions from MIMIC-III/IV and developed an interpretable machine learning model using clinical data collected within the first 24 hours of ICU stay. Methodologically, we introduced a two-stage interpretable feature selection framework combining recursive feature elimination with cross-validation (RFECV) and SHAP analysis, which reduced 80 candidate variables to 20 predictors and identified cerebrovascular disease history, serum creatinine, and systolic blood pressure as the three most influential risk factors. The pipeline combines iterative SVD-based imputation, z-score normalization, and ADASYN oversampling with a CatBoost classifier. It achieves an AUROC of 0.8868 (95% CI: 0.8802–0.8937), the best among the eight ML models evaluated. The approach pairs predictive accuracy with clinical interpretability, supporting early risk stratification and informing perioperative decision-making.

📝 Abstract
Postoperative stroke remains a critical complication in elderly surgical intensive care unit (SICU) patients, contributing to prolonged hospitalization, elevated healthcare costs, and increased mortality. Accurate early risk stratification is essential to enable timely intervention and improve clinical outcomes. We constructed a combined cohort of 19,085 elderly SICU admissions from the MIMIC-III and MIMIC-IV databases and developed an interpretable machine learning (ML) framework to predict in-hospital stroke using clinical data from the first 24 hours of Intensive Care Unit (ICU) stay. The preprocessing pipeline included removal of high-missingness features, iterative Singular Value Decomposition (SVD) imputation, z-score normalization, one-hot encoding, and class imbalance correction via the Adaptive Synthetic Sampling (ADASYN) algorithm. A two-stage feature selection process, combining Recursive Feature Elimination with Cross-Validation (RFECV) and SHapley Additive exPlanations (SHAP), reduced the initial 80 variables to 20 clinically informative predictors. Among eight ML models evaluated, CatBoost achieved the best performance with an AUROC of 0.8868 (95% CI: 0.8802–0.8937). SHAP analysis and ablation studies identified prior cerebrovascular disease, serum creatinine, and systolic blood pressure as the most influential risk factors. Our results highlight the potential of interpretable ML approaches to support early detection of postoperative stroke and inform decision-making in perioperative critical care.
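The abstract reports an AUROC with a 95% confidence interval but does not say how the interval was computed. As a rough illustration only, the sketch below computes AUROC from ranks (the Mann-Whitney formulation) and a percentile bootstrap interval, which is one common choice and an assumption here. The function names `auroc` and `bootstrap_ci`, the toy labels and scores, and the parameters `n_boot`, `alpha`, and `seed` are all hypothetical and are not taken from the paper.

```python
import random

def auroc(y_true, y_score):
    """AUROC as P(score of a positive > score of a negative); ties count 0.5."""
    # Rank scores in ascending order, giving tied scores their average rank.
    order = sorted(range(len(y_score)), key=lambda i: y_score[i])
    ranks = [0.0] * len(y_score)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and y_score[order[j + 1]] == y_score[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one sample of each class")
    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def bootstrap_ci(y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC (assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        labels = [y_true[i] for i in idx]
        if 0 < sum(labels) < n:  # skip resamples that contain a single class
            stats.append(auroc(labels, [y_score[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy labels (1 = stroke) and predicted risks, for illustration only.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
toy_y = [0, 0, 1, 1, 0, 1, 0, 1]
toy_p = [0.1, 0.3, 0.7, 0.8, 0.45, 0.4, 0.2, 0.9]
print(bootstrap_ci(toy_y, toy_p, n_boot=200))
```

With a cohort of this size the interval would normally be computed on held-out predictions; the toy vectors only exercise the functions.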
Problem

Research questions and friction points this paper is trying to address.

Predicting postoperative stroke risk in elderly SICU patients
Developing interpretable ML model using early ICU data
Identifying key clinical predictors for stroke prevention
Innovation

Methods, ideas, or system contributions that make the work stand out.

Interpretable ML model predicts postoperative stroke
SVD imputation and ADASYN handle data issues
RFECV and SHAP select key clinical features for the CatBoost model
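To make the ADASYN step in the list above concrete, here is a minimal standard-library sketch of the idea: minority samples surrounded by more majority neighbours receive more synthetic points, and each synthetic point is an interpolation toward a nearby minority sample. This is a simplified illustration, not the authors' implementation; the function name `adasyn`, the parameters `k`, `beta`, and `seed`, the label convention (1 = minority/stroke), and the toy data are all assumptions.

```python
import math
import random

def _nearest(point, candidates, k, skip=None):
    """Indices of the k candidates closest to `point` (Euclidean), excluding `skip`."""
    dists = [(math.dist(point, c), j) for j, c in enumerate(candidates) if j != skip]
    return [j for _, j in sorted(dists)[:k]]

def adasyn(X, y, k=3, beta=1.0, seed=0):
    """Return synthetic minority samples (label 1) following the ADASYN scheme."""
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == 1]
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    n_major = len(y) - len(minority)
    n_new = int((n_major - len(minority)) * beta)  # total synthetic samples wanted
    # Difficulty ratio: share of majority points among each sample's k neighbours.
    ratios = []
    for i in minority:
        nbrs = _nearest(X[i], X, k, skip=i)
        ratios.append(sum(1 for j in nbrs if y[j] == 0) / k)
    total = sum(ratios)
    if total == 0 or n_new <= 0:
        return []  # classes already balanced or fully separated
    X_min = [X[i] for i in minority]
    synthetic = []
    for pos, r in enumerate(ratios):
        n_i = round(r / total * n_new)  # harder samples get more synthetic points
        nbrs = _nearest(X_min[pos], X_min, k, skip=pos)
        for _ in range(n_i):
            other = X_min[rng.choice(nbrs)]
            lam = rng.random()  # interpolation weight in [0, 1)
            synthetic.append([a + lam * (b - a) for a, b in zip(X_min[pos], other)])
    return synthetic

# Toy data: 7 majority points (label 0) and 3 minority points (label 1).
X_toy = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [2, 2], [1.5, 1.5],
         [1.2, 1.2], [2.5, 2.5], [3, 3]]
y_toy = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
synthetic = adasyn(X_toy, y_toy, k=3)
print(len(synthetic), synthetic)
```

Oversampling of this kind is normally applied to the training split only, so that synthetic points never leak into evaluation data.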
Tinghuan Li
Department of Industrial and Systems Engineering, University of Southern California, 3715 McClintock Ave GER 240, Los Angeles, 90087, California, United States
Shuheng Chen
University of Southern California
Machine Learning, Data Science, Predictive Analytics, Clinical Prediction
Junyi Fan
University of Southern California
machine learning
E. Pishgar
Colorectal Research Center, Iran University of Medical Sciences, Tehran Hemat Highway next to Milad Tower, Tehran, 14535, Iran
K. Alaei
Department of Health Science, California State University, Long Beach (CSULB), 1250 Bellflower Blvd, Long Beach, 90840, California, United States
G. Placencia
Department of Industrial and Manufacturing Engineering, California State Polytechnic University, Pomona, 3801 W Temple Ave, Pomona, 91768, California, United States
M. Pishgar
Department of Industrial and Systems Engineering, University of Southern California, 3715 McClintock Ave GER 240, Los Angeles, 90087, California, United States