Benchmarking Deep Learning Models for Laryngeal Cancer Staging Using the LaryngealCT Dataset

📅 2025-10-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Longstanding limitations in laryngeal cancer imaging analysis stem from the absence of standardized, publicly available CT datasets, hindering reproducible deep learning research. To address this, we introduce LaryngealCT, the first open-source benchmark dataset for laryngeal T-staging, comprising 1,029 contrast-enhanced CT scans. We propose a weakly supervised region-of-interest (ROI) extraction method and integrate 3D Grad-CAM for interpretable model analysis, revealing attention patterns around critical anatomical structures, particularly the laryngeal cartilages. On two binary classification tasks, early (Tis–T2) versus advanced (T3–T4) staging and T4 versus non-T4, we achieve state-of-the-art AUCs of 0.881 and 0.892 using a 3D CNN and a ResNet-18, respectively. All models undergo rigorous validation by clinical experts. The dataset, source code, and trained models are fully open-sourced to foster transparency and reproducibility in laryngeal cancer AI research.

📝 Abstract
Laryngeal cancer imaging research lacks standardised datasets to enable reproducible deep learning (DL) model development. We present LaryngealCT, a curated benchmark of 1,029 computed tomography (CT) scans aggregated from six collections in The Cancer Imaging Archive (TCIA). Uniform 1 mm isotropic volumes of interest encompassing the larynx were extracted using a weakly supervised parameter search framework validated by clinical experts. 3D DL architectures (3D CNN, ResNet-18/50/101, DenseNet-121) were benchmarked on (i) early (Tis, T1, T2) vs. advanced (T3, T4) and (ii) T4 vs. non-T4 classification tasks. A 3D CNN (AUC = 0.881, macro-F1 = 0.821) and ResNet-18 (AUC = 0.892, macro-F1 = 0.646) outperformed the other models on the two tasks, respectively. Model explainability, assessed using 3D Grad-CAMs with thyroid cartilage overlays, revealed greater peri-cartilage attention in non-T4 cases and focal activations in T4 predictions. Through open-source data, pretrained models, and integrated explainability tools, LaryngealCT offers a reproducible foundation for AI-driven research to support clinical decisions in laryngeal oncology.
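The abstract's preprocessing step, resampling each CT scan to uniform 1 mm isotropic voxels, can be sketched as below. This is a minimal illustration using SciPy; the interpolation order and the library the authors actually used are assumptions, not taken from the paper.

```python
import numpy as np
from scipy.ndimage import zoom

def resample_isotropic(volume: np.ndarray, spacing_mm, target_mm: float = 1.0) -> np.ndarray:
    """Resample a CT volume to isotropic voxels of size target_mm.

    volume: array of shape (slices, rows, cols)
    spacing_mm: per-axis voxel spacing in millimetres, same order as the axes
    """
    factors = [s / target_mm for s in spacing_mm]  # per-axis zoom factors
    return zoom(volume, factors, order=1)          # order=1: trilinear interpolation

# Hypothetical example: 50 slices at 3.0 mm thickness, 0.7 mm in-plane spacing
vol = np.random.rand(50, 100, 100).astype(np.float32)
iso = resample_isotropic(vol, (3.0, 0.7, 0.7))
print(iso.shape)  # (150, 70, 70)
```

Resampling to a common voxel grid before cropping volumes of interest keeps physical distances comparable across the six TCIA collections, which acquire scans at different slice thicknesses.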
Problem

Research questions and friction points this paper is trying to address.

Standardizing datasets for reproducible laryngeal cancer deep learning models
Benchmarking 3D architectures for early versus advanced cancer staging
Providing explainable AI tools for clinical decision support in oncology
Innovation

Methods, ideas, or system contributions that make the work stand out.

Used weakly supervised parameter search for CT scan extraction
Benchmarked 3D deep learning models for cancer staging
Integrated explainability tools with 3D GradCAMs for predictions
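The 3D Grad-CAM explainability step follows the standard recipe: weight each channel's activation map by the spatial average of its gradients, sum, and apply a ReLU. A minimal NumPy sketch of that computation (independent of the authors' framework and layer choices, which are not specified here):

```python
import numpy as np

def grad_cam_3d(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Grad-CAM heatmap for a 3D convolutional layer.

    activations, gradients: arrays of shape (C, D, H, W) captured at the
    target layer during the forward and backward passes, respectively.
    """
    # Channel weights: global average pooling of gradients over D, H, W
    weights = gradients.mean(axis=(1, 2, 3))
    # Weighted sum of activation maps over channels, then ReLU
    cam = np.einsum('c,cdhw->dhw', weights, activations)
    cam = np.maximum(cam, 0.0)
    if cam.max() > 0:
        cam = cam / cam.max()  # normalise to [0, 1] for overlay
    return cam

# Toy example: two channels on a 4x4x4 feature grid
acts = np.ones((2, 4, 4, 4))
grads = np.stack([np.full((4, 4, 4), 2.0), np.ones((4, 4, 4))])
heatmap = grad_cam_3d(acts, grads)
```

Upsampled back to the input volume's resolution, such a heatmap can be overlaid on anatomical masks (e.g. the thyroid cartilage, as in the paper) to inspect where the model attends.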
Nivea Roy
School of Information Technology, Deakin University, VIC, Australia
Son Tran
Senior Principal Scientist, Amazon
Computer Vision, Machine Learning, Deep Learning, Video Processing
Atul Sajjanhar
Deakin University
Education technologies, pattern recognition, machine learning, deep learning, IoT
K. Devaraja
Department of Head and Neck Surgery, Kasturba Medical College, Manipal, Manipal Academy of Higher Education, Manipal, Karnataka 576104, India
Prakashini Koteshwara
Department of Radiodiagnosis and Imaging, Kasturba Medical College, Manipal, Manipal Academy of Higher Education, Manipal, Karnataka 576104, India
Yong Xiang
School of Information Technology, Deakin University
Cybersecurity, data science, machine learning & AI, distributed computing, communication engineering
Divya Rao
Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka 576104, India