🤖 AI Summary
Acute subdural hematoma (SDH) demands rapid, accurate diagnosis in emergency settings. Method: We propose an end-to-end multimodal deep-learning framework integrating 3D CT voxel modeling, Transformer-enhanced 2D segmentation, and structured clinical data (via XGBoost/MLP) to achieve high-accuracy detection and anatomically consistent pixel-level localization. Contribution/Results: We introduce a clinico-radiological dual-track interpretable fusion paradigm, a greedy ensemble strategy for coordinating heterogeneous models, and probabilistic localization maps that respect neuroanatomical constraints, addressing the limitations of conventional black-box models. Evaluated on 25,315 real-world non-contrast head CT studies, the framework achieves an AUC of 0.9407 for SDH detection and a Dice similarity of 0.82 between predicted localization maps and radiologist annotations. Detection performance improves by up to +0.19 AUC over unimodal baselines (clinical-only AUC 0.75), enabling reliable real-time triage in acute care.
📝 Abstract
Background. Subdural hematoma (SDH) is a common neurosurgical emergency, with increasing incidence in aging populations. Rapid and accurate identification is essential to guide timely intervention, yet existing automated tools focus primarily on detection and provide limited interpretability or spatial localization. There remains a need for transparent, high-performing systems that integrate multimodal clinical and imaging information to support real-time decision-making.
Methods. We developed a multimodal deep-learning framework that integrates structured clinical variables, a 3D convolutional neural network trained on CT volumes, and a transformer-enhanced 2D segmentation model for SDH detection and localization. The cohort comprised 25,315 head CT studies from Hartford HealthCare (2015--2024), of which 3,774 (14.9%) contained clinician-confirmed SDH. Tabular models were trained on demographics, comorbidities, medications, and laboratory results; imaging models were trained to detect SDH and generate voxel-level probability maps. A greedy ensemble strategy combined complementary predictors.
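The greedy ensemble step can be illustrated with a minimal sketch. The paper does not specify its exact procedure, so the version below assumes a Caruana-style forward selection: starting from an empty ensemble, repeatedly add (with replacement) the model whose inclusion in the averaged prediction most improves validation AUC. All names and data here are hypothetical.

```python
import numpy as np

def auc(y_true, y_score):
    """Rank-based AUC: probability that a positive case outranks a negative one."""
    order = np.argsort(y_score)
    ranks = np.empty(len(y_score), dtype=float)
    ranks[order] = np.arange(1, len(y_score) + 1)
    pos = y_true == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def greedy_ensemble(preds, y_val, n_rounds=10):
    """Forward-select models (with replacement) to maximize validation AUC.

    preds: dict mapping model name -> validation-set probabilities.
    Returns (list of selected names, AUC of the final averaged ensemble).
    """
    selected = []
    running = np.zeros(len(y_val), dtype=float)  # mean of selected predictions
    best_auc = 0.0
    for _ in range(n_rounds):
        best_name = None
        best_auc = -1.0
        for name, p in preds.items():
            # AUC if this model's predictions were averaged into the ensemble
            trial = (running * len(selected) + p) / (len(selected) + 1)
            score = auc(y_val, trial)
            if score > best_auc:
                best_name, best_auc = name, score
        selected.append(best_name)
        running = (running * (len(selected) - 1) + preds[best_name]) / len(selected)
    return selected, best_auc
```

Selection with replacement lets a strong model be weighted more heavily simply by being picked in multiple rounds, which is one common way to coordinate heterogeneous predictors (e.g. a tabular XGBoost model and CNN image models) without fitting meta-learner weights.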
Findings. Clinical variables alone provided modest discriminatory power (AUC 0.75). Convolutional models trained on raw CT volumes and on segmentation-derived probability maps achieved substantially higher accuracy (AUC 0.922 and 0.926, respectively). The multimodal ensemble integrating all components achieved the best overall performance (AUC 0.9407; 95% CI, 0.930--0.951) and produced anatomically meaningful localization maps consistent with known SDH patterns.
Interpretation. This multimodal, interpretable framework provides rapid, accurate SDH detection and localization with transparent, anatomically grounded outputs. Integration into radiology workflows could streamline triage, reduce time to intervention, and improve consistency in SDH management.