🤖 AI Summary
Predicting overall survival (OS) in non-small cell lung cancer (NSCLC) patients from multicenter CT data remains challenged by regional heterogeneity and batch effects.
Method: We propose a multi-regional radiomic framework that jointly models features from the whole lung, primary tumor, mediastinal lymph nodes, coronary arteries, and coronary artery calcium (CAC), integrating handcrafted radiomics, foundation model (FM)-derived deep features, and clinical variables. We compare three harmonization strategies, namely ComBat, reconstruction kernel normalization (RKN), and the combined RKN+ComBat, use SHAP values to explain feature contributions, and build a consensus model that stratifies patient risk from agreement across the top region-of-interest (ROI) models.
Contribution/Results: The FM-plus-clinical model achieves a C-index of 0.7616 and a 5-year time-dependent AUC (t-AUC) of 0.8866, compared with a C-index of 0.67 and t-AUC of 0.85 for TNM staging. The consensus model attains t-AUC = 0.92, sensitivity = 97.6%, and specificity = 66.7% on the 78% of valid test cases it covers. These results indicate that harmonization and multi-region feature integration improve interpretable, robust OS prediction in multicenter NSCLC data.
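The consensus idea can be illustrated with a minimal sketch. The source does not state the exact agreement rule, so the code below assumes the simplest one: each region-level model casts a binary high-risk (1) or low-risk (0) vote, a patient is stratified only when all votes agree, and the model abstains otherwise. The function names and the toy data are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Optional, Sequence


def consensus_label(votes: Sequence[int]) -> Optional[int]:
    """Return the shared label if every model agrees, else None (abstain).

    Assumption: unanimous agreement; the paper's actual rule may differ.
    """
    first = votes[0]
    return first if all(v == first for v in votes) else None


def evaluate_consensus(
    votes_per_patient: Sequence[Sequence[int]],
    truth: Sequence[int],
) -> Dict[str, float]:
    """Compute coverage, sensitivity, and specificity of the consensus calls.

    Sensitivity and specificity are computed only on covered patients,
    i.e. those for whom the models agreed.
    """
    labels: List[Optional[int]] = [consensus_label(v) for v in votes_per_patient]
    covered = [(pred, t) for pred, t in zip(labels, truth) if pred is not None]

    tp = sum(1 for pred, t in covered if pred == 1 and t == 1)
    fn = sum(1 for pred, t in covered if pred == 0 and t == 1)
    tn = sum(1 for pred, t in covered if pred == 0 and t == 0)
    fp = sum(1 for pred, t in covered if pred == 1 and t == 0)

    return {
        "coverage": len(covered) / len(labels),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Toy example: three hypothetical region-level models, five patients.
toy_votes = [(1, 1, 1), (0, 0, 0), (1, 0, 1), (1, 1, 1), (0, 0, 0)]
toy_truth = [1, 0, 1, 0, 0]
toy_result = evaluate_consensus(toy_votes, toy_truth)
```

In this toy example the third patient receives no consensus call, so coverage is 4 of 5. This shows why the reported sensitivity and specificity apply only to the covered share of test cases.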
📝 Abstract
Purpose: To evaluate the impact of harmonization and multi-region CT image feature integration on survival prediction in non-small cell lung cancer (NSCLC) patients, using handcrafted radiomics, pretrained foundation model (FM) features, and clinical data from a multicenter dataset.

Methods: We analyzed CT scans and clinical data from 876 NSCLC patients (604 training, 272 test) across five centers. Features were extracted from the whole lung, tumor, mediastinal nodes, coronary arteries, and coronary artery calcium (CAC). Handcrafted radiomics and FM deep features were harmonized using ComBat, reconstruction kernel normalization (RKN), and RKN+ComBat. Regularized Cox models predicted overall survival; performance was assessed using the concordance index (C-index), 5-year time-dependent area under the curve (t-AUC), and hazard ratio (HR). SHapley Additive exPlanations (SHAP) values explained feature contributions. A consensus model used agreement across top region of interest (ROI) models to stratify patient risk.

Results: TNM staging showed prognostic utility (C-index = 0.67; HR = 2.70; t-AUC = 0.85). The clinical + tumor radiomics model with ComBat achieved a C-index of 0.7552 and t-AUC of 0.8820. FM features (50-voxel cubes) combined with clinical data yielded the highest performance (C-index = 0.7616; t-AUC = 0.8866). An ensemble of all ROIs and FM features reached a C-index of 0.7142 and t-AUC of 0.7885. The consensus model, covering 78% of valid test cases, achieved a t-AUC of 0.92, sensitivity of 97.6%, and specificity of 66.7%.

Conclusion: Harmonization and multi-region feature integration improve survival prediction in multicenter NSCLC data. Combining interpretable radiomics, FM features, and consensus modeling enables robust risk stratification across imaging centers.
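For readers unfamiliar with the concordance index used above, the following is a minimal sketch of Harrell's C-index in plain Python. It is an illustration, not the study's evaluation code. It assumes right-censored data, counts a pair as comparable only when the patient with the shorter time had an observed event, gives half credit for tied risk scores, and ignores pairs with tied times. The function name and the toy data are hypothetical.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Fraction of comparable patient pairs ranked correctly by risk score.

    times:       follow-up time per patient
    events:      1 if death was observed, 0 if the patient was censored
    risk_scores: model output; a higher score means shorter predicted survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A censored patient cannot serve as the earlier event of a pair.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions get half credit
    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable


# Toy example: four patients, the third one censored at t = 6.
toy_times = [2.0, 4.0, 6.0, 8.0]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.5, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported range of 0.67 (TNM staging) to 0.7616 (FM features plus clinical data) in context.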