🤖 AI Summary
This work addresses the scarcity of expert annotations in coronary computed tomography angiography (CCTA) and the limited ability of existing self-supervised methods to model localized pathological features by proposing CORA, a pathology synthesis–driven 3D vision foundation model. CORA leverages an anatomy-guided lesion synthesis engine to enable self-supervised pretraining on unlabeled CCTA data, explicitly focusing on critical pathological characteristics such as plaques. Notably, it is the first to integrate a large language model for multimodal major adverse cardiac event (MACE) risk stratification. Pretrained on 12,801 unlabeled CCTA scans and evaluated on multi-center data from nine independent hospitals, CORA outperforms current 3D foundation models on downstream tasks by up to 29% and significantly improves 30-day MACE risk stratification.
📝 Abstract
Coronary artery disease, the leading cause of cardiovascular mortality worldwide, can be assessed non-invasively by coronary computed tomography angiography (CCTA). Despite progress in automated CCTA analysis using deep learning, clinical translation is constrained by the scarcity of expert-annotated datasets. Furthermore, widely adopted label-free pretraining strategies, such as masked image modeling, are intrinsically biased toward global anatomical statistics and frequently fail to capture the spatially localized pathological features of coronary plaques. Here, we introduce CORA, a 3D vision foundation model for comprehensive cardiovascular risk assessment. CORA learns directly from volumetric CCTA via a pathology-centric, synthesis-driven self-supervised framework. By utilizing an anatomy-guided lesion synthesis engine, the model is explicitly trained to detect simulated vascular abnormalities, biasing representation learning toward clinically relevant disease features rather than dominant background anatomy. We trained CORA on a large-scale cohort of 12,801 unlabeled CCTA volumes and comprehensively evaluated the model across multi-center datasets from nine independent hospitals. Across diagnostic and anatomical tasks, including plaque characterization, stenosis detection, and coronary artery segmentation, CORA consistently outperformed state-of-the-art 3D vision foundation models, achieving up to a 29% performance gain. Crucially, by coupling the imaging encoder with a large language model, we extended CORA into a multimodal framework that significantly improved 30-day major adverse cardiac event (MACE) risk stratification. Our results establish CORA as a scalable and extensible foundation for unified anatomical assessment and cardiovascular risk prediction.
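
The abstract does not say how the lesion synthesis engine is implemented. As a purely illustrative toy sketch of the general idea (altering an unlabeled volume only inside vessel anatomy, so that the altered region becomes a free detection label), one might write something like the following. The function name `synthesize_lesion`, the spherical lesion shape, and the `radius` and `intensity_shift` parameters are all assumptions made for illustration and are not taken from CORA.

```python
import numpy as np


def synthesize_lesion(volume, vessel_mask, rng, radius=2.0, intensity_shift=300.0):
    """Paint one spherical pseudo-lesion centred on a random vessel voxel.

    Hypothetical helper, not the CORA engine. Returns the altered volume
    and a boolean mask marking the altered voxels.
    """
    vessel_mask = vessel_mask.astype(bool)
    vessel_voxels = np.argwhere(vessel_mask)
    if len(vessel_voxels) == 0:
        raise ValueError("vessel_mask contains no vessel voxels")
    center = vessel_voxels[rng.integers(len(vessel_voxels))]

    # Squared distance from every voxel to the chosen centre (broadcast over 3 axes).
    grids = np.ogrid[tuple(slice(0, s) for s in volume.shape)]
    dist2 = sum((g - c) ** 2 for g, c in zip(grids, center))

    # Anatomy guidance: keep only the part of the sphere that lies inside the vessel.
    lesion_mask = (dist2 <= radius ** 2) & vessel_mask

    altered = volume.astype(np.float32, copy=True)
    altered[lesion_mask] += intensity_shift  # assumed plaque-like intensity change
    return altered, lesion_mask


# Toy usage: a straight "vessel" running along axis 0 of an empty volume.
rng = np.random.default_rng(0)
volume = np.zeros((16, 16, 16), dtype=np.float32)
vessel_mask = np.zeros(volume.shape, dtype=bool)
vessel_mask[:, 7:9, 7:9] = True
altered, lesion_mask = synthesize_lesion(volume, vessel_mask, rng)
```

In a pretraining loop of this kind, `altered` would be the network input and `lesion_mask` the target, so no expert annotation is needed. A realistic engine would presumably model plaque shape, composition, and vessel geometry far more carefully than this sphere.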