🤖 AI Summary
This study addresses the challenge of early prediction of graft-versus-host disease (GVHD) following liver transplantation. We propose a deep learning framework that dynamically fuses heterogeneous preoperative electronic health record (EHR) data—including demographics, laboratory tests, diagnoses, and medications—that are irregularly recorded, have high missingness, and are severely class-imbalanced. Our method introduces a modality-adaptive fusion mechanism and an AUC-oriented loss function to mitigate extreme class imbalance. Evaluated on a real-world clinical dataset, the model achieves an AUC of 0.836, a recall of 0.768, and a specificity of 0.803, significantly outperforming unimodal and state-of-the-art multimodal baselines. To our knowledge, this is the first work to model multimodal EHRs specifically for GVHD risk prediction after liver transplantation. Its key contribution lies in jointly addressing data heterogeneity, temporal irregularity, and extreme class imbalance, delivering an interpretable and robust decision-support tool for timely clinical intervention.
📝 Abstract
Graft-versus-host disease (GVHD) is a rare complication of liver transplantation with a very high mortality rate. By harnessing multi-modal deep learning to integrate heterogeneous and imbalanced electronic health records (EHR), we aim to advance early prediction of GVHD, paving the way for timely intervention and improved patient outcomes. We analyzed pre-transplant EHR data for 2,100 liver transplantation patients, including 42 GVHD cases, from a cohort treated at Mayo Clinic between 1992 and 2025. The dataset comprised four major modalities: patient demographics, laboratory tests, diagnoses, and medications. We developed a multi-modal deep learning framework that dynamically fuses these modalities, handles irregular records with missing values, and addresses extreme class imbalance through AUC-based optimization. The framework outperforms all single-modal and multi-modal machine learning baselines, achieving an AUC of 0.836, an AUPRC of 0.157, a recall of 0.768, and a specificity of 0.803. These results also show that the fused modalities contribute complementary information that improves performance. By effectively addressing heterogeneity, temporal irregularity, and extreme class imbalance in real-world EHR data, our framework substantially improves on existing approaches and enables accurate early prediction of GVHD in liver transplantation.
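The abstract does not specify the exact form of the AUC-based objective, so as a hedged illustration only, here is one common family of such objectives: a pairwise squared-hinge surrogate for 1 − AUC. The function name, margin value, and NumPy implementation below are assumptions for exposition, not the paper's actual loss; the point is that the loss is computed over (positive, negative) pairs, so it directly optimizes ranking rather than per-sample accuracy, which is why such losses behave well when positives are rare (42 of 2,100 here).

```python
import numpy as np


def pairwise_auc_surrogate_loss(scores, labels, margin=1.0):
    """Squared-hinge surrogate for 1 - AUC (illustrative sketch).

    For every (positive, negative) pair, penalize any negative whose
    score comes within `margin` of the positive's score. Driving this
    loss to zero forces all positives to rank above all negatives by
    at least `margin`, i.e. it pushes AUC toward 1 regardless of how
    imbalanced the two classes are.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    # All pairwise score differences s_pos - s_neg, shape (P, N).
    diff = pos[:, None] - neg[None, :]
    # Hinge activates only when a negative is not separated by `margin`.
    return float(np.mean(np.maximum(0.0, margin - diff) ** 2))
```

With well-separated scores the loss is zero; with the ranking reversed it is large, which is the gradient signal a trainable model would follow. Differentiable versions of this pairwise term are what deep learning frameworks typically minimize in place of the non-differentiable AUC itself.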