AI Summary
This study addresses the challenges of data noise, clinical bias, and model opacity in electronic health records (EHRs) that hinder reliable clinical decision support. We propose Trust-MAPS, a novel framework that formalizes clinical knowledge as high-dimensional physiological constraints and formulates a mixed-integer programming model to achieve trustworthy EHR data projection and quantitative anomaly detection. The framework generates clinically meaningful, interpretable "trust-score" features for enhancing downstream predictive models. Methodologically, Trust-MAPS integrates trust-aware data correction, constraint-driven feature engineering, and an XGBoost classifier, augmented with SMOTE to mitigate class imbalance. Evaluated on early sepsis prediction, it achieves an AUROC of 0.91 (95% CI: 0.89–0.92), a 15% improvement over a baseline model trained without Trust-MAPS. Results demonstrate its unified efficacy in error correction, bias mitigation, and model interpretability.
Abstract
The objective of this work is to develop an Electronic Medical Record (EMR) data processing tool that confers clinical context to Machine Learning (ML) algorithms for error handling, bias mitigation, and interpretability. We present Trust-MAPS, an algorithm that translates clinical domain knowledge into high-dimensional, mixed-integer programming models that capture physiological and biological constraints on clinical measurements. EMR data are projected onto this constrained space, effectively bringing outliers into a physiologically feasible range. We then compute the distance of each data point from the constrained space modeling healthy physiology to quantify deviation from the norm. These distances, termed "trust-scores," are integrated into the feature space for downstream ML applications. We demonstrate the utility of Trust-MAPS by training a binary classifier for early sepsis prediction on data from the 2019 PhysioNet Computing in Cardiology Challenge, using the XGBoost algorithm and applying SMOTE to overcome class imbalance. The Trust-MAPS framework shows desirable behavior in handling potential errors and boosting predictive performance. We achieve an AUROC of 0.91 (95% CI: 0.89, 0.92) for predicting sepsis 6 hours before onset, a marked 15% improvement over a baseline model trained without Trust-MAPS. Trust-scores emerge as clinically meaningful features that not only boost predictive performance for clinical decision support tasks, but also lend interpretability to ML models. This work is the first to translate clinical domain knowledge into mathematical constraints, model cross-vital dependencies, and identify aberrations in high-dimensional medical data. Our method allows for error handling in EMR data, and confers interpretability and superior predictive power to models trained for clinical decision support.
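The projection and trust-score steps described above can be illustrated with a deliberately simplified sketch. It is not the Trust-MAPS implementation: the abstract describes mixed-integer programming models with cross-vital dependencies, whereas this example assumes independent per-variable interval (box) constraints, for which Euclidean projection reduces to clipping each value to its interval. The variable names, the function names, and every numeric range below are hypothetical and chosen only for illustration; they are not clinical reference values.

```python
import math

# Hypothetical ranges for illustration only. These are NOT clinical reference
# values and NOT the constraints used by Trust-MAPS.
FEASIBLE = {"heart_rate": (30.0, 200.0), "temp_c": (30.0, 43.0)}
NORMAL = {"heart_rate": (60.0, 100.0), "temp_c": (36.0, 38.0)}


def project_onto_box(record, bounds):
    """Euclidean projection onto a box: clip each coordinate to its interval."""
    return {k: min(max(v, bounds[k][0]), bounds[k][1]) for k, v in record.items()}


def euclidean_distance(a, b):
    """Euclidean distance between two records that share the same keys."""
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in a))


def trust_score_features(record):
    """Return (corrected record, trust-score) under the box-constraint assumption.

    Step 1: project the raw record onto the feasible region (error correction).
    Step 2: measure how far the corrected record lies from the 'healthy' region.
    """
    corrected = project_onto_box(record, FEASIBLE)
    nearest_healthy = project_onto_box(corrected, NORMAL)
    return corrected, euclidean_distance(corrected, nearest_healthy)


# An implausible heart rate is pulled back to the feasible bound, and the
# trust-score measures the remaining deviation from the normal ranges.
corrected, score = trust_score_features({"heart_rate": 250.0, "temp_c": 39.0})
print(corrected, score)
```

In this sketch the raw variables are on different scales, so heart rate dominates the distance; a practical version would likely normalize each variable first, and the resulting score would be appended as an extra feature column before training the downstream classifier.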