Machine Learning based Analysis for Radiomics Features Robustness in Real-World Deployment Scenarios

📅 2025-10-28
🤖 AI Summary
Problem: Radiomics models exhibit poor robustness in clinical deployment because variations in MRI acquisition protocols, anatomical positioning, and segmentation induce distribution shifts. Method: We use a protocol-invariant feature selection strategy, evaluated on a controlled phantom of 16 fruits imaged with five MRI sequences (T2-HASTE, T2-TSE, T2-MAP, T1-TSE, and T2-FLAIR), to systematically assess cross-protocol generalizability and uncertainty quantification. Our approach combines XGBoost classification, dataset augmentation, temperature scaling for calibration, and experiments that isolate protocol, segmentation, and inter-observer variability. Contribution/Results: A model trained on 8 protocol-invariant features maintains F1-scores >0.85 across distribution shifts, whereas models using all features show 40% performance degradation under protocol changes. Dataset augmentation reduces expected calibration error (ECE) by 35% without sacrificing accuracy, while temperature scaling provides only minimal calibration benefit. This work offers an evaluation framework and a practical path toward radiomics models that remain robust in clinical deployment.

📝 Abstract
Radiomics-based machine learning models show promise for clinical decision support but are vulnerable to distribution shifts caused by variations in imaging protocols, positioning, and segmentation. This study systematically investigates the robustness of radiomics-based machine learning models under distribution shifts across five MRI sequences. We evaluated how different acquisition protocols and segmentation strategies affect model reliability in terms of predictive power and uncertainty-awareness. Using a phantom of 16 fruits, we evaluated distribution shifts through: (1) protocol variations across T2-HASTE, T2-TSE, T2-MAP, T1-TSE, and T2-FLAIR sequences; (2) segmentation variations (full, partial, rotated); and (3) inter-observer variability. We trained XGBoost classifiers on 8 consistent robust features versus sequence-specific features, testing model performance under in-domain and out-of-domain conditions. Results demonstrate that models trained on protocol-invariant features maintain F1-scores >0.85 across distribution shifts, while models using all features showed 40% performance degradation under protocol changes. Dataset augmentation substantially improved the quality of uncertainty estimates and reduced the expected calibration error (ECE) by 35% without sacrificing accuracy. Temperature scaling provided minimal calibration benefits, confirming XGBoost's inherent reliability. Our findings reveal that protocol-aware feature selection and controlled phantom studies effectively predict model behavior under distribution shifts, providing a framework for developing robust radiomics models resilient to real-world protocol variations.
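The two calibration tools named in the abstract, expected calibration error and temperature scaling, can be illustrated with a minimal Python sketch. It assumes equal-width confidence bins for ECE and a simple grid search for the temperature; neither detail is stated in the abstract, and all function names and inputs here are hypothetical.

```python
import math


def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sample-weighted mean of |accuracy - mean confidence| per bin.

    confidences: top-class probability of each prediction, in [0, 1].
    correct: booleans, True where the predicted class was right.
    Assumption: equal-width bins (the binning scheme is not given in the abstract).
    """
    if not confidences or len(confidences) != len(correct):
        raise ValueError("need two non-empty sequences of equal length")
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


def softmax(logits, temperature=1.0):
    """Softmax of logits divided by a temperature (T > 1 softens the output)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def negative_log_likelihood(logit_rows, labels, temperature):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    total = 0.0
    for logits, label in zip(logit_rows, labels):
        total -= math.log(softmax(logits, temperature)[label])
    return total / len(labels)


def fit_temperature(logit_rows, labels, grid=None):
    """Pick the temperature that minimizes NLL on held-out data.

    Assumption: a coarse grid search stands in for a gradient-based optimizer.
    """
    if grid is None:
        grid = [0.25 + 0.05 * i for i in range(96)]  # 0.25 up to 5.0
    return min(grid, key=lambda t: negative_log_likelihood(logit_rows, labels, t))
```

In this sketch the temperature would be fitted on a held-out validation split and ECE compared before and after scaling. A fitted temperature close to 1 would match the abstract's observation that temperature scaling brought little benefit.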
Problem

Research questions and friction points this paper is trying to address.

Evaluating radiomics model robustness under MRI protocol variations
Assessing segmentation strategy impacts on predictive reliability
Developing protocol-invariant features to maintain performance across distributions
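A minimal sketch of how an in-domain versus out-of-domain comparison can be scored. It assumes macro-averaged F1 and a relative definition of degradation; the abstract specifies neither, and the fruit labels and predictions in the usage note are made up for illustration.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed averaging mode)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class that is never true and never predicted cannot occur here,
        # but guard against division by zero anyway.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def relative_degradation(in_domain_score, out_of_domain_score):
    """Fraction of the in-domain score lost under distribution shift."""
    return (in_domain_score - out_of_domain_score) / in_domain_score
```

For example, with hypothetical labels `["apple", "apple", "kiwi", "kiwi"]`, a model that is perfect on the training sequence but predicts `["apple", "kiwi", "kiwi", "kiwi"]` on an unseen sequence would be scored by calling `macro_f1` on each prediction list and passing the two results to `relative_degradation`.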
Innovation

Methods, ideas, or system contributions that make the work stand out.

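Trained XGBoost classifiers on 8 protocol-invariant features, maintaining F1-scores >0.85 across distribution shifts
Applied dataset augmentation to improve uncertainty estimates, reducing ECE by 35% without sacrificing accuracy
Employed controlled phantom studies (16 fruits, five MRI sequences) to predict model behavior under distribution shifts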
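One way to picture protocol-invariant feature selection is to keep only features whose values for the same object barely change between sequences. The sketch below assumes a coefficient-of-variation criterion with a hypothetical threshold; the paper's actual selection rule is not given in the abstract, and the feature names used in the usage note are illustrative only.

```python
from statistics import mean, pstdev


def cross_protocol_cv(values_by_protocol):
    """Mean coefficient of variation of one feature across protocols.

    values_by_protocol maps a protocol name to a list of feature values,
    one per phantom object, with objects in the same order for every protocol.
    """
    cvs = []
    for object_values in zip(*values_by_protocol.values()):
        mu = mean(object_values)
        # A zero mean makes the ratio undefined; treat the feature as unstable.
        cvs.append(pstdev(object_values) / abs(mu) if mu else float("inf"))
    return mean(cvs)


def select_protocol_invariant(features, max_cv=0.1):
    """Return the names of features whose cross-protocol CV is at most max_cv.

    Assumption: max_cv=0.1 is an arbitrary illustrative threshold.
    """
    scores = {name: cross_protocol_cv(vals) for name, vals in features.items()}
    return sorted(name for name, cv in scores.items() if cv <= max_cv)
```

Here `features` would map a feature name such as `"shape_volume"` to a dictionary like `{"T2-HASTE": [...], "T2-TSE": [...]}`. A classifier trained only on the returned names could then be compared against one trained on all features.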
Sarmad Ahmad Khan
German Cancer Consortium (DKTK), partner site Frankfurt/Mainz, a partnership between DKFZ and UCT Frankfurt-Marburg, Germany; German Cancer Research Center (DKFZ), Heidelberg, Germany; Goethe University Frankfurt, Germany
Simon Bernatz
Goethe University Frankfurt, Germany
Zahra Moslehi
German Cancer Consortium (DKTK), partner site Frankfurt/Mainz, a partnership between DKFZ and UCT Frankfurt-Marburg, Germany; Goethe University Frankfurt, Germany
Florian Buettner
Frankfurt University/DKFZ