🤖 AI Summary
This work addresses the lack of efficient, reliable quality assessment for automatically generated contours (auto-contours) in online adaptive radiotherapy (OART). We propose a Bayesian ordinal classification (BOC) framework that operates without ground-truth contour annotations. The method combines uncertainty quantification, geometry-derived surrogate labels, and uncertainty-threshold calibration, so that contour quality can be classified with no manual labels, limited manual labels, or extensive manual labels. With only 30 manually labeled cases for fine-tuning and 34 cases for calibration, the model reaches over 90% classification accuracy on test data; with the calibrated threshold, the quality of over 93% of auto-contours is predicted correctly in over 98% of cases. To our knowledge, this is the first approach to combine ordinal classification with Bayesian deep learning for contour quality evaluation, delivering clinically interpretable confidence scores. By reducing the manual review burden, the framework supports rapid, trustworthy OART decision-making.
📝 Abstract
Purpose: This study presents a deep learning (DL)-based quality assessment (QA) approach for evaluating auto-generated contours (auto-contours) in radiotherapy, with emphasis on online adaptive radiotherapy (OART). Using Bayesian ordinal classification (BOC) and calibrated uncertainty thresholds, the method produces confident QA predictions without relying on ground-truth contours or extensive manual labeling.
Methods: We developed a BOC model that classifies auto-contour quality and quantifies the uncertainty of each prediction. A calibration step then selects the uncertainty threshold that meets a clinically required accuracy. The method was validated on rectum contours in prostate cancer under three data scenarios: no manual labels, limited manual labels, and extensive manual labels. We used geometric surrogate labels when manual labels were absent, transfer learning when they were limited, and direct supervision when they were ample.
Results: The BOC model performed robustly in all three scenarios. Fine-tuning with just 30 manual labels and calibrating with 34 subjects yielded over 90% accuracy on test data. With the calibrated threshold, the quality of over 93% of the auto-contours was predicted accurately in over 98% of cases, which reduces unnecessary manual reviews and highlights the cases that need correction.
Conclusion: The proposed QA model improves contouring efficiency in OART by reducing manual workload and enabling fast, informed clinical decisions. Its uncertainty quantification supports safer, more reliable radiotherapy workflows.
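The threshold calibration described in the Methods can be illustrated with a minimal sketch. The abstract does not specify the calibration procedure, so everything here is an assumption for illustration: the function names `calibrate_uncertainty_threshold` and `needs_manual_review`, the use of one scalar uncertainty per contour, and the rule of picking the largest threshold whose retained cases still meet a target accuracy on a held-out calibration set.

```python
from typing import Optional, Sequence, Tuple


def calibrate_uncertainty_threshold(
    uncertainties: Sequence[float],
    correct: Sequence[bool],
    target_accuracy: float,
) -> Optional[Tuple[float, float, float]]:
    """Pick the largest uncertainty threshold that still meets the target accuracy.

    Hypothetical helper, not the authors' procedure. For each candidate
    threshold, only calibration cases with uncertainty <= threshold are kept,
    and accuracy is measured on those kept cases.

    Returns (threshold, accuracy_on_kept_cases, fraction_of_cases_kept),
    or None if no threshold reaches the target accuracy.
    """
    if len(uncertainties) != len(correct) or len(uncertainties) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Visit calibration cases from most to least confident.
    order = sorted(range(len(uncertainties)), key=lambda i: uncertainties[i])
    total = len(order)
    best = None
    n_correct = 0
    for rank, i in enumerate(order, start=1):
        n_correct += int(correct[i])
        # Cases with equal uncertainty share a threshold: evaluate the group once.
        if rank < total and uncertainties[order[rank]] == uncertainties[i]:
            continue
        accuracy = n_correct / rank
        if accuracy >= target_accuracy:
            # Later (larger) thresholds overwrite earlier ones, maximizing coverage.
            best = (uncertainties[i], accuracy, rank / total)
    return best


def needs_manual_review(uncertainty: float, threshold: float) -> bool:
    """Flag a contour for manual review when its prediction is too uncertain."""
    return uncertainty > threshold


# Toy calibration set (made-up numbers, not data from the study).
toy_uncertainties = [0.05, 0.10, 0.20, 0.40, 0.60, 0.90]
toy_correct = [True, True, True, False, True, False]
print(calibrate_uncertainty_threshold(toy_uncertainties, toy_correct, 0.9))
```

At inference time, predictions whose uncertainty is at or below the calibrated threshold would be accepted automatically, and the rest would be routed to manual review. Choosing the largest qualifying threshold is one possible design choice: it maximizes the share of contours handled without review while keeping the accuracy constraint on the calibration set.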