🤖 AI Summary
Lung cancer prognosis from CT imaging remains largely manual, and clinicians' limited trust in AI-generated tumor segmentations further hinders clinical adoption.
Method: We propose a clinician-centered participatory AI framework that integrates click-guided semi-automatic segmentation, multi-model comparison (VNet, 3D Attention U-Net, and others), PySERA-based radiomics analysis, and semi-supervised learning, validated across multi-center datasets.
Contribution/Results: VNet achieved the best overall performance: segmentation Dice score of 0.83, radiomic feature stability (ICC = 0.65), and semi-supervised classification accuracy of 0.88 (F1 = 0.83). Six radiologists independently rated VNet's peritumoral structural characterization and boundary smoothness as the most clinically credible. The framework substantially improves model interpretability, clinical acceptability, and real-world translatability, bridging the gap between AI development and routine clinical practice.
📝 Abstract
Lung cancer remains the leading cause of cancer mortality, with CT imaging central to screening, prognosis, and treatment. Manual segmentation is variable and time-intensive, while deep learning (DL) offers automation but faces barriers to clinical adoption. Guided by the Knowledge-to-Action framework, this study develops a clinician-in-the-loop DL pipeline to enhance reproducibility, prognostic accuracy, and clinical trust. Multi-center CT data from 999 patients across 12 public datasets were analyzed using five DL models (3D Attention U-Net, ResUNet, VNet, ReconNet, SAM-Med3D), benchmarked against expert contours on both whole images and click-point-cropped images. Segmentation reproducibility was assessed over 497 PySERA-extracted radiomic features via Spearman correlation, ICC, Wilcoxon tests, and MANOVA, while prognostic modeling compared supervised (SL) and semi-supervised learning (SSL) across 38 dimensionality-reduction strategies and 24 classifiers. Six physicians qualitatively evaluated masks across seven domains, including clinical meaningfulness, boundary quality, prognostic value, trust, and workflow integration. VNet achieved the best segmentation performance (Dice = 0.83, IoU = 0.71), radiomic stability (mean correlation = 0.76, ICC = 0.65), and predictive accuracy under SSL (accuracy = 0.88, F1 = 0.83). SSL consistently outperformed SL across models. Radiologists favored VNet for peritumoral representation and smoother boundaries, preferring AI-generated initial masks for refinement rather than replacement. These results demonstrate that integrating VNet with SSL yields accurate, reproducible, and clinically trusted CT-based lung cancer prognosis, highlighting a feasible path toward physician-centered AI translation.
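The headline metrics (Dice = 0.83, IoU = 0.71) are standard overlap measures between a predicted mask and an expert contour. A minimal NumPy sketch of how they are computed on binary masks (function names and the toy masks are illustrative, not from the paper):

```python
import numpy as np

def dice_score(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice coefficient: 2|A∩B| / (|A| + |B|) for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    total = pred.sum() + gt.sum()
    return 2.0 * inter / total if total else 1.0

def iou_score(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection over union (Jaccard index): |A∩B| / |A∪B|."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

# Toy 3D volumes standing in for a predicted and an expert tumor mask.
pred = np.zeros((4, 4, 4), dtype=bool)
gt = np.zeros((4, 4, 4), dtype=bool)
pred[1:3, 1:3, 1:3] = True   # 8 voxels
gt[1:3, 1:3, 1:4] = True     # 12 voxels, 8 of them overlapping

print(dice_score(pred, gt))  # → 0.8
print(iou_score(pred, gt))
```

For the same pair of masks the two measures are linked by IoU = Dice / (2 − Dice), so the reported Dice of 0.83 corresponds to an IoU of about 0.71, consistent with the abstract's figures.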