🤖 AI Summary
Current ratio-based biomarkers (e.g., tumor necrosis fraction) yield only point estimates without uncertainty quantification, limiting their utility in high-stakes clinical decision-making. To address this, we propose the first end-to-end confidence-aware estimation framework. We systematically identify insufficient model calibration, rather than segmentation error propagation, as the dominant source of predictive uncertainty. Our method introduces a lightweight, plug-and-play post-hoc calibration module that adapts to a hospital's internal data without retraining, and exposes an adjustable parameter *Q* for clinically customizable confidence control. By integrating soft segmentation probability propagation, temperature scaling, and quantile regression, we establish a statistically rigorous framework for interval estimation. Evaluated on multi-center oncological imaging data, our approach significantly improves confidence interval coverage and tightness (reducing average error bounds by 37%) while preserving clinical interpretability and deployment efficiency.
📝 Abstract
Ratio-based biomarkers -- such as the proportion of necrotic tissue within a tumor -- are widely used in clinical practice to support diagnosis, prognosis and treatment planning. These biomarkers are typically estimated from soft segmentation outputs by computing region-wise ratios. Despite the high-stakes nature of clinical decision-making, existing methods provide only point estimates, offering no measure of uncertainty. In this work, we propose a unified \textit{confidence-aware} framework for estimating ratio-based biomarkers. We conduct a systematic analysis of error propagation in the segmentation-to-biomarker pipeline and identify model miscalibration as the dominant source of uncertainty. To mitigate this, we incorporate a lightweight, post-hoc calibration module that can be applied using internal hospital data without retraining. We leverage a tunable parameter $Q$ to control the confidence level of the derived bounds, allowing them to be adapted to clinical practice. Extensive experiments show that our method produces statistically sound confidence intervals, with tunable confidence levels, enabling more trustworthy application of predictive biomarkers in clinical workflows.
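To make the pipeline described above more concrete, the following is a minimal sketch and not the authors' implementation. It assumes per-voxel logits for binary necrosis and tumor masks, a single temperature `T`, and a margin taken as the `Q`-quantile of absolute residuals on held-out calibration cases. The abstract does not specify how $Q$ enters the bounds, so that part in particular is a stand-in, and all function names are hypothetical.

```python
import numpy as np


def temperature_scale(logits, T):
    """Sigmoid of logits / T. Assumption: one scalar temperature T > 0."""
    return 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float) / T))


def soft_ratio(p_necrosis, p_tumor, eps=1e-8):
    """Ratio biomarker from soft masks: expected necrotic volume / expected tumor volume."""
    return float(np.sum(p_necrosis) / (np.sum(p_tumor) + eps))


def calibrate_margin(pred_ratios, true_ratios, Q):
    """Hypothetical calibration step: Q-quantile of absolute residuals
    measured on held-out cases (e.g. a hospital's internal data)."""
    residuals = np.abs(np.asarray(pred_ratios) - np.asarray(true_ratios))
    return float(np.quantile(residuals, Q))


def ratio_interval(ratio, margin):
    """Symmetric interval around the point estimate, clipped to [0, 1]."""
    return max(0.0, ratio - margin), min(1.0, ratio + margin)


# Toy usage: four voxels, all confidently tumor, two ambiguous for necrosis.
p_tumor = temperature_scale([9.0, 9.0, 9.0, 9.0], T=1.5)
p_necrosis = temperature_scale([0.0, 0.0, -9.0, -9.0], T=1.5)
ratio = soft_ratio(p_necrosis, p_tumor)

# Calibration residuals from three hypothetical held-out cases.
margin = calibrate_margin([0.2, 0.3, 0.5], [0.25, 0.3, 0.4], Q=0.9)
low, high = ratio_interval(ratio, margin)
```

In this sketch, raising `Q` widens the interval and lowering it tightens it, which is one simple way a single parameter could trade coverage against tightness.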