🤖 AI Summary
Colorectal cancer (CRC) risk assessment for individuals with polyps relies mainly on histopathological descriptions that vary between pathologists, while other risk factors recorded in the medical record are largely ignored, which leads to inconsistent surveillance decisions. To address this, we propose a multimodal deep learning framework that jointly models digitized histopathology images and medical record information. Our method trains a Transformer-based pathology image model to predict intermediate clinical variables rather than the outcome directly, which improves 5-year CRC risk prediction and reaches an AUC of 0.630. We further investigate fusion techniques that combine imaging and non-imaging features without requiring manual review of the microscopy images. Evaluated on data from the New Hampshire Colonoscopy Registry, the fused model predicts 5-year CRC risk better than variables extracted from the colonoscopy procedure and microscopy findings. The results point to the potential of automated, multimodal decision support for colonoscopic surveillance.
📝 Abstract
Colonoscopy screening is an effective method to find and remove colon polyps before they can develop into colorectal cancer (CRC). Current follow-up recommendations for individuals found to have polyps, as outlined by the U.S. Multi-Society Task Force, rely primarily on histopathological characteristics and neglect other significant CRC risk factors. Moreover, the considerable variability among pathologists in characterizing colorectal polyps makes effective colonoscopy follow-up and surveillance difficult. Digital pathology and recent advances in deep learning provide an opportunity to investigate whether calculating future CRC risk benefits from two additions: further medical record information, and automatic processing of pathology slides with computer vision techniques. Leveraging the New Hampshire Colonoscopy Registry's extensive dataset, much of it with longitudinal colonoscopy follow-up information, we adapted our recently developed transformer-based model for histopathology image analysis to 5-year CRC risk prediction. Additionally, we investigated various multimodal fusion techniques that combine medical record information with deep-learning-derived risk estimates. Our findings reveal that training a transformer model to predict intermediate clinical variables improves 5-year CRC risk prediction compared to direct prediction, reaching an AUC of 0.630. Furthermore, the fusion of imaging and non-imaging features, which does not require manual inspection of microscopy images, predicts 5-year CRC risk better than variables extracted from the colonoscopy procedure and microscopy findings. This study highlights the potential of integrating diverse data sources and advanced computational techniques to improve the accuracy and effectiveness of future CRC risk assessments.
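To make the fusion and evaluation ideas in the abstract concrete, the following is a minimal Python sketch of one possible late-fusion scheme, under stated assumptions. It is not the authors' implementation: the abstract does not specify which fusion techniques were used, and every name here (`auc`, `fit_late_fusion`, `image_risk`, `age_scaled`) is hypothetical. The data are synthetic and exist only to make the example runnable. The sketch feeds an image-derived risk estimate and one medical record variable into a logistic regression and scores the result with the area under the ROC curve (AUC).

```python
import math
import random


def auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Count pairwise wins; ties count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_late_fusion(features, labels, lr=0.5, epochs=300):
    """Fit a logistic regression by batch gradient descent (illustrative only)."""
    n, d = len(features), len(features[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - y
            grad_b += err
            for j in range(d):
                grad_w[j] += err * x[j]
        b -= lr * grad_b / n
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
    return w, b


def predict(w, b, features):
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) for x in features]


# Synthetic, made-up data: NOT drawn from the New Hampshire Colonoscopy Registry.
random.seed(0)
rows, labels = [], []
for _ in range(200):
    y = 1 if random.random() < 0.3 else 0
    # Hypothetical image-model risk estimate, clipped to [0, 1].
    image_risk = min(1.0, max(0.0, random.gauss(0.4 + 0.15 * y, 0.2)))
    # Hypothetical scaled medical record variable (for example, age).
    age_scaled = random.gauss(0.5 + 0.1 * y, 0.2)
    rows.append([image_risk, age_scaled])
    labels.append(y)

w, b = fit_late_fusion(rows, labels)
fused = predict(w, b, rows)
print("image-only AUC:", round(auc(labels, [r[0] for r in rows]), 3))
print("fused AUC:", round(auc(labels, fused), 3))
```

The pairwise definition of AUC used here is quadratic in the number of cases, which is fine for a toy example. A real evaluation would use held-out data rather than scoring on the training set as this sketch does.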