Efficient 2D CT Foundation Model for Contrast Phase Classification

📅 2025-01-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses key challenges in CT contrast-phase classification, namely the high computational cost of 3D models, strong dependence on labeled data, and poor cross-center generalizability, by proposing the first 2D vision foundation model tailored for CT phase identification. Methodologically, it leverages a 2D Vision Transformer (ViT) pretrained on DeepLesion to extract robust single-slice CT embeddings, couples them with a lightweight classification head, and validates performance on the multi-center VinDr and WAW-TACE datasets. It is the first systematic demonstration that 2D foundation models outperform 3D baselines (including 3D CNNs, ResNet3D, and SlowFast) in cross-domain robustness, training efficiency, and labeling efficiency. Results show F1 scores of 99.2%/94.2%/93.1% for the non-contrast/arterial/venous phases on VinDr, and AUROCs of 91.0%/85.6% for the non-contrast/arterial phases on WAW-TACE, surpassing all 3D baselines while training 3.2× faster. This work establishes a new paradigm for low-cost, highly generalizable foundation models in medical imaging.

📝 Abstract
Purpose: The purpose of this study is to harness the efficiency of a 2D foundation model to develop a robust phase classifier that is resilient to domain shifts. Materials and Methods: This retrospective study utilized three public datasets from separate institutions. A 2D foundation model was trained on the DeepLesion dataset (mean age: 51.2, s.d.: 17.6; 2398 males) to generate embeddings from 2D CT slices for downstream contrast phase classification. The classifier was trained on the VinDr Multiphase dataset and externally validated on the WAW-TACE dataset. The 2D model was also compared to three 3D supervised models. Results: On the VinDr dataset (146 male, 63 female, 56 unidentified), the model achieved near-perfect AUROC scores and F1 scores of 99.2%, 94.2%, and 93.1% for the non-contrast, arterial, and venous phases, respectively. The 'Other' category scored lower (F1: 73.4%) because it combines multiple contrast phases into one class. On the WAW-TACE dataset (mean age: 66.1, s.d.: 10.0; 185 males), the model showed strong performance with AUROCs of 91.0% and 85.6%, and F1 scores of 87.3% and 74.1%, for the non-contrast and arterial phases. Venous phase performance was lower, with an AUROC of 81.7% and an F1 score of 70.2%, due to label mismatches. Compared to the 3D supervised models, the approach trained faster, performed as well or better, and showed greater robustness to domain shifts. Conclusion: The robustness of the 2D foundation model may be useful for automating hanging protocols and orchestrating data for clinical deployment of AI algorithms.
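The pipeline the abstract describes, frozen foundation-model embeddings feeding a lightweight classification head, can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's implementation: the embedding dimension, class names, synthetic Gaussian "embeddings", and training loop are all assumptions standing in for the real DeepLesion-pretrained ViT features.

```python
import numpy as np

# Hedged sketch: a frozen 2D foundation model maps each CT slice to a
# fixed-length embedding; only a lightweight linear (softmax) head is trained.
# Dimensions, class names, and data here are illustrative assumptions.
rng = np.random.default_rng(0)
EMBED_DIM = 768                      # typical ViT embedding size (assumed)
CLASSES = ["non-contrast", "arterial", "venous", "other"]
N_PER_CLASS = 50

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Synthetic stand-in for precomputed slice embeddings: one Gaussian cluster
# per contrast phase, so the toy problem is linearly separable.
centers = rng.normal(size=(len(CLASSES), EMBED_DIM))
y = np.repeat(np.arange(len(CLASSES)), N_PER_CLASS)
X = centers[y] + rng.normal(scale=1.0, size=(len(y), EMBED_DIM))

# Lightweight head: a single linear layer trained with cross-entropy
# via plain full-batch gradient descent. The backbone stays frozen, so
# only W and b are updated -- this is what keeps training cheap.
W = np.zeros((EMBED_DIM, len(CLASSES)))
b = np.zeros(len(CLASSES))
onehot = np.eye(len(CLASSES))[y]
for _ in range(200):
    probs = softmax(X @ W + b)
    grad = probs - onehot            # d(cross-entropy)/d(logits)
    W -= 0.01 * (X.T @ grad) / len(X)
    b -= 0.01 * grad.mean(axis=0)

train_acc = (softmax(X @ W + b).argmax(axis=1) == y).mean()
```

Because only the small linear head is optimized while the 2D backbone is reused as a fixed feature extractor, this setup needs far fewer labels and far less compute than training a 3D network end to end, which is the efficiency argument the paper makes.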
Problem

Research questions and friction points this paper is trying to address.

2D CT Scan Model
Contrast Agent Discrimination
Medical Diagnosis
Innovation

Methods, ideas, or system contributions that make the work stand out.

2D Model
Robust Classification
AI Automation in Healthcare
Benjamin Hou
Imperial College London
Machine Learning, Medical Image Analysis, Natural Language Processing

T. Mathai
Imaging Biomarkers and Computer Aided Diagnosis Lab, Clinical Center - National Institutes of Health, Bethesda, MD, USA.

Pritam Mukherjee
National Institutes of Health Clinical Center
Machine Learning for Healthcare, Medical Imaging

Xinya Wang
National Institutes of Health

Ronald M. Summers
Imaging Biomarkers and Computer Aided Diagnosis Lab, Clinical Center - National Institutes of Health, Bethesda, MD, USA.

Zhiyong Lu
Senior Investigator, NLM; Adjunct Professor of CS, UIUC
BioNLP, Biomedical Informatics, Medical AI, Artificial Intelligence