🤖 AI Summary
This study addresses the inter-observer variability in manual interpretation of conventional coronary angiography and the limitations of existing AI approaches, which often restrict analysis to single frames or single views and focus solely on stenosis detection. The authors propose DeepCORO-CLIP, the first method to introduce multi-view video–text contrastive learning into coronary angiography analysis. By employing attention-based pooling to fuse multi-angle angiographic videos, the model enables comprehensive lesion identification—including stenosis, chronic total occlusion, thrombus, and calcification—and supports cross-task transfer learning for diagnosis, prognosis, and disease progression assessment. In internal and external evaluations, DeepCORO-CLIP achieved AUROCs of 0.888 and 0.890, respectively, for significant stenosis detection, outperforming clinical reports in quantitative accuracy. It also predicted one-year major adverse cardiovascular events (AUROC: 0.79) and estimated left ventricular ejection fraction (MAE: 7.3%), with a mean inference time of 4.2 seconds, demonstrating robust generalization across multicenter datasets.
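The attention-based pooling described above can be illustrated with a minimal sketch: per-view video embeddings are scored against a learnable query, softmax-normalized, and summed. The query vector `w_query` and the embedding dimensions here are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def attention_pool(view_embeddings: np.ndarray, w_query: np.ndarray) -> np.ndarray:
    """Fuse per-view video embeddings into one study-level embedding.

    view_embeddings: (n_views, d) array, one row per angiographic projection.
    w_query: (d,) attention query vector (hypothetical stand-in for the
             model's learned pooling parameters).
    """
    scores = view_embeddings @ w_query               # one relevance score per view
    scores -= scores.max()                           # numerical stability for softmax
    weights = np.exp(scores) / np.exp(scores).sum()  # attention weights, sum to 1
    return weights @ view_embeddings                 # weighted sum over views
```

Because the weights sum to one, the study-level embedding is a convex combination of the view embeddings, so no single projection dominates unless attention assigns it nearly all the weight.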
📝 Abstract
Coronary angiography is the reference standard for evaluating coronary artery disease, yet visual interpretation remains variable among readers. Existing artificial intelligence methods typically analyze single frames or projections and focus mainly on stenosis, limiting comprehensive coronary assessment. We present DeepCORO-CLIP, a multi-view foundation model trained with video–text contrastive learning on 203,808 angiography videos from 28,117 patients across 32,473 studies at the Montreal Heart Institute and externally validated on 4,249 studies from the University of California, San Francisco. DeepCORO-CLIP integrates multiple projections with attention-based pooling for study-level assessment across diagnostic, prognostic, and disease progression tasks. For significant stenosis detection, the model achieved an AUROC of 0.888 internally and 0.890 on external validation. Mean absolute error against core laboratory quantitative coronary angiography was 13.6%, lower than that of clinical reports at 19.0%. The model also performed strongly for chronic total occlusion, intracoronary thrombus, and coronary calcification detection. Transfer learning enabled prediction of one-year major adverse cardiovascular events with an AUROC of 0.79 and estimation of left ventricular ejection fraction with a mean absolute error of 7.3%. Embeddings also captured disease progression across serial examinations. With a mean inference time of 4.2 seconds in hospital deployment, DeepCORO-CLIP provides a foundation for automated coronary angiography interpretation at the point of care. Code, sample data, model weights, and deployment infrastructure are publicly released.
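The video–text contrastive training the abstract describes is, in CLIP-style setups, a symmetric InfoNCE objective over a batch of paired video and report embeddings. The sketch below illustrates that general objective under stated assumptions: the temperature value, batch construction, and embedding dimensions are illustrative, not the paper's actual training configuration.

```python
import numpy as np

def clip_contrastive_loss(video_emb: np.ndarray, text_emb: np.ndarray,
                          temperature: float = 0.07) -> float:
    """Symmetric CLIP-style contrastive loss for a batch of (video, text) pairs.

    video_emb, text_emb: (batch, d) arrays; row i of each is a matched pair.
    temperature: softmax temperature (0.07 is a common default, assumed here).
    """
    # L2-normalize so dot products are cosine similarities
    v = video_emb / np.linalg.norm(video_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = v @ t.T / temperature          # (batch, batch) similarity matrix
    labels = np.arange(len(logits))         # matched pair sits on the diagonal

    def cross_entropy(l: np.ndarray) -> float:
        # log-softmax over each row, then pick the diagonal (true pair)
        l = l - l.max(axis=1, keepdims=True)
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return float(-log_probs[labels, labels].mean())

    # average of video->text and text->video directions
    return (cross_entropy(logits) + cross_entropy(logits.T)) / 2
```

Minimizing this loss pulls each video embedding toward its own report embedding and pushes it away from the other reports in the batch, which is what lets the shared embedding space transfer to downstream diagnostic and prognostic tasks.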