AI in Lung Health: Benchmarking Detection and Diagnostic Models Across Multiple CT Scan Datasets

Date: 2024-05-07
Venue: arXiv.org
Citations: 2
Influential citations: 0
AI Summary
To address the poor generalizability and limited robustness of LDCT-based AI models for early lung cancer diagnosis, this work introduces a cross-dataset benchmark framework for pulmonary nodule detection and lung cancer diagnosis, systematically evaluating 3D models on the open DLCS dataset (2,000+ scans). It proposes Strategic Warm-Start++ (SWS++), a task-adaptive pretraining method that uses candidate lesion patches generated by the detection pipeline to pretrain the malignancy classification backbone. Detection is evaluated with the Competition Performance Metric (CPM), and all code, pretrained models, and data are publicly released. SWS++ achieves AUCs of 0.71 to 0.90 across multiple datasets, comparable or superior to existing pretrained models including Models Genesis and Med3D. The detection models also generalize well to external benchmarks such as LUNA16 and NLST-3D+.

Abstract
Lung cancer remains the leading cause of cancer-related mortality worldwide, and early detection through low-dose computed tomography (LDCT) has shown significant promise in reducing death rates. With the growing integration of artificial intelligence (AI) into medical imaging, the development and evaluation of robust AI models require access to large, well-annotated datasets. In this study, we demonstrate the utility of the Duke Lung Cancer Screening (DLCS) Dataset, the largest open-access LDCT dataset, with over 2,000 scans and 3,000 expert-verified nodules. We benchmark deep learning models for both 3D nodule detection and lung cancer classification across internal and external datasets, including LUNA16, LUNA25, and NLST-3D+. For detection, we develop two MONAI-based RetinaNet models (DLCSDmD and LUNA16-mD), evaluated using the Competition Performance Metric (CPM). For classification, we compare five models: two state-of-the-art pretrained models (Models Genesis, Med3D), a self-supervised foundation model (FMCB), a randomly initialized ResNet50, and our proposed Strategic Warm-Start++ (SWS++) model. SWS++ uses curated candidate patches to pretrain a classification backbone within the same detection pipeline, enabling task-relevant feature learning. Our models demonstrated strong generalizability, with SWS++ achieving comparable or superior performance to existing foundational models across multiple datasets (AUC: 0.71 to 0.90). All code, models, and data are publicly released to promote reproducibility and collaboration. This work establishes a standardized benchmarking resource for lung cancer AI research, supporting future efforts in model development, validation, and clinical translation.
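The detection models are scored with the Competition Performance Metric (CPM). In the LUNA16 challenge, CPM is the average sensitivity at seven operating points of the FROC curve: 1/8, 1/4, 1/2, 1, 2, 4, and 8 false positives per scan. The sketch below assumes the paper follows that definition. It is a simplified step-wise version (no interpolation or bootstrapping, as an official evaluation script may apply), and the function names `froc_curve` and `cpm` are hypothetical, not taken from the released code.

```python
import numpy as np

# Operating points (average false positives per scan) from the LUNA16
# definition of CPM. Assumed here to match the paper's evaluation.
CPM_FP_RATES = (0.125, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0)


def froc_curve(scores, is_true_positive, num_nodules, num_scans):
    """Return (fps_per_scan, sensitivity) swept over descending scores.

    Simplifying assumption: every true-positive candidate matches a
    distinct ground-truth nodule (no duplicate hits on one nodule).
    """
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    hits = np.asarray(is_true_positive, dtype=bool)[order]
    true_pos = np.cumsum(hits)
    false_pos = np.cumsum(~hits)
    return false_pos / num_scans, true_pos / num_nodules


def cpm(fps_per_scan, sensitivity, fp_rates=CPM_FP_RATES):
    """Average the best sensitivity reachable at or below each FP rate."""
    fps_per_scan = np.asarray(fps_per_scan, dtype=float)
    sensitivity = np.asarray(sensitivity, dtype=float)
    values = []
    for rate in fp_rates:
        allowed = fps_per_scan <= rate
        # If no threshold meets the FP budget, count sensitivity as 0.
        values.append(sensitivity[allowed].max() if allowed.any() else 0.0)
    return float(np.mean(values))


if __name__ == "__main__":
    # Toy example: 4 candidates over 2 scans containing 2 nodules in total.
    fps, sens = froc_curve(
        scores=[0.9, 0.8, 0.7, 0.6],
        is_true_positive=[True, False, True, False],
        num_nodules=2,
        num_scans=2,
    )
    print(f"CPM = {cpm(fps, sens):.3f}")
```

Averaging over a range of false-positive budgets rewards detectors that stay sensitive at both strict and permissive thresholds, which is why a single CPM value is convenient for cross-dataset comparison.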
Problem

Research questions and friction points this paper is trying to address.

Benchmarking AI models for lung nodule detection in CT scans
Evaluating deep learning models for lung cancer classification accuracy
Providing open-access datasets and models for reproducibility in lung cancer AI research
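Classification performance in the abstract is reported as AUC. As a reminder of what that number measures, here is a minimal sketch that computes ROC AUC as the probability that a randomly chosen malignant case scores higher than a randomly chosen benign one, counting ties as one half. The function name `roc_auc` is hypothetical, and the pairwise comparison uses memory proportional to positives times negatives, so it suits small evaluation sets only.

```python
import numpy as np


def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]
    neg = scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUC needs at least one positive and one negative case")
    # One row per positive case, one column per negative case.
    diff = pos[:, None] - neg[None, :]
    # A correctly ordered pair counts 1, a tied pair counts 0.5.
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())


if __name__ == "__main__":
    # Toy example: 1 = malignant, 0 = benign.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Because AUC depends only on the ranking of scores, it allows models with differently calibrated outputs to be compared on the same dataset.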
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses DLCS, the largest open-access LDCT dataset
Benchmarks MONAI-based RetinaNet for nodule detection
Introduces Strategic Warm-Start++ for classification
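SWS++ is described as pretraining a classification backbone on candidate patches taken from the detection pipeline. The abstract does not say how those patches are cropped or curated, so the following is only an illustrative sketch of one common preprocessing step: cutting a fixed-size 3D patch around a candidate voxel and padding where the window runs past the scan border. The function name `crop_patch`, the 64-voxel patch size, and the -1024 HU padding value are all assumptions, not details from the paper.

```python
import numpy as np


def crop_patch(volume, center_zyx, size=(64, 64, 64), pad_value=-1024.0):
    """Crop a fixed-size patch centred on a voxel, padding past borders.

    volume:     3D array (z, y, x), e.g. a CT scan in Hounsfield units.
    center_zyx: candidate location in voxel coordinates.
    size:       patch shape (hypothetical default, not from the paper).
    pad_value:  fill value outside the scan (assumed: air, -1024 HU).
    """
    size = np.asarray(size, dtype=int)
    center = np.round(np.asarray(center_zyx, dtype=float)).astype(int)
    start = center - size // 2
    stop = start + size
    # Pad just enough that the crop window lies inside the padded volume.
    pad_before = np.maximum(-start, 0)
    pad_after = np.maximum(stop - np.asarray(volume.shape), 0)
    pad_width = [(int(b), int(a)) for b, a in zip(pad_before, pad_after)]
    padded = np.pad(volume, pad_width, constant_values=pad_value)
    # Shift the window start into the padded coordinate frame.
    start = start + pad_before
    window = tuple(slice(int(a), int(a + s)) for a, s in zip(start, size))
    return padded[window]


if __name__ == "__main__":
    scan = np.zeros((10, 10, 10), dtype=np.float32)
    patch = crop_patch(scan, center_zyx=(0, 5, 9), size=(4, 4, 4))
    print(patch.shape)
```

Padding the whole volume for each crop keeps the sketch short. A production pipeline would more likely pad once per scan or compute the overlap region directly.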
F. Tushar
Dept. of Electrical & Computer Engineering, Pratt School of Engineering, Duke University, Durham; Center for Virtual Imaging Trials, Carl E. Ravin Advanced Imaging Laboratories, Department of Radiology, Duke University School of Medicine, Durham, NC
Avivah J Wang
Duke University School of Medicine, Durham, NC
Lavsen Dahal
Duke University
Deep Learning, Medical Imaging, Computer Vision
Michael R. Harowicz
Division of Cardiothoracic Imaging, Department of Radiology, Duke University School of Medicine, Durham, NC
Kyle J Lafata
Dept. of Electrical & Computer Engineering, Pratt School of Engineering, Duke University, Durham; Center for Virtual Imaging Trials, Carl E. Ravin Advanced Imaging Laboratories, Department of Radiology, Duke University School of Medicine, Durham, NC
Tina D Tailor
Division of Cardiothoracic Imaging, Department of Radiology, Duke University School of Medicine, Durham, NC
Joseph Y. Lo
Professor of Radiology, Biomed. Engineering, Elec. Engineering, Med Physics
medical imaging, machine learning