🤖 AI Summary
Medical image segmentation models rarely provide reliable confidence estimates, which hinders clinical deployment. To address this, the authors propose DEviS, an evidential uncertainty modeling framework with three components: (1) a trainable calibration mechanism that combines subjective logic with the Dirichlet distribution; (2) an uncertainty-aware filtering module that automatically selects high-confidence samples; and (3) a differentiable uncertainty calibration loss that allows predictive calibration and segmentation robustness to be optimized jointly. On the ISIC2018, LiTS2017, and BraTS2019 benchmarks, the method improves segmentation accuracy (Dice score ↑1.2–2.8%) and calibration quality (Expected Calibration Error ↓35–52%). On the Johns Hopkins OCT, Duke-OCT-DME, and FIVES datasets, it is also used to detect out-of-distribution samples and to curate high-quality data. The work is positioned as a step toward trustworthy AI-assisted diagnosis in medical imaging.
📝 Abstract
Medical image segmentation is critical for disease diagnosis and treatment assessment. However, clinicians remain concerned about the reliability of segmented regions, mainly because existing methods lack confidence assessment, robustness, and calibration of confidence to accuracy. To address this, we introduce DEviS, an easily implementable foundational model that integrates into various medical image segmentation networks. DEviS improves the calibration and robustness of baseline segmentation models and also provides efficient uncertainty estimation for reliable predictions. By leveraging subjective logic theory, we explicitly model probability and uncertainty for medical image segmentation. Here, a Dirichlet distribution parameterizes the distribution over class probabilities of the segmentation results. To generate calibrated predictions and uncertainty, we develop a trainable calibrated uncertainty penalty. Furthermore, DEviS incorporates an uncertainty-aware filtering module, which uses the uncertainty-calibrated error metric to filter reliable data within a dataset. We conducted validation studies to assess the accuracy and robustness of DEviS segmentation and to evaluate the efficiency and reliability of its uncertainty estimation. These evaluations used the publicly available ISIC2018, LiTS2017, and BraTS2019 datasets. Additionally, two potential clinical trials are being conducted on the Johns Hopkins OCT, Duke-OCT-DME, and FIVES datasets to demonstrate the efficacy of DEviS in filtering high-quality or out-of-distribution data. Our code is available at https://github.com/Cocofeat/DEviS.
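To make the subjective-logic step in the abstract more concrete, the following is a minimal sketch of how per-class evidence for a single pixel can be turned into Dirichlet parameters, belief masses, and an uncertainty mass. It follows the standard subjective-logic mapping (alpha_k = e_k + 1, S = sum of alpha_k, b_k = e_k / S, u = K / S). The function names and the choice of softplus as the evidence activation are assumptions made for illustration and are not taken from the DEviS code, which may differ.

```python
import math
from typing import List, Tuple


def softplus(x: float) -> float:
    """Numerically stable softplus; maps a logit to non-negative evidence.

    Assumption: the abstract does not say which activation produces evidence.
    """
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dirichlet_opinion(logits: List[float]) -> Tuple[List[float], List[float], float]:
    """Hypothetical helper: per-class logits for one pixel ->
    (expected class probabilities, belief masses, uncertainty mass)."""
    k = len(logits)
    evidence = [softplus(z) for z in logits]    # e_k >= 0
    alpha = [e + 1.0 for e in evidence]         # Dirichlet parameters alpha_k
    strength = sum(alpha)                       # Dirichlet strength S
    belief = [e / strength for e in evidence]   # b_k = e_k / S
    uncertainty = k / strength                  # u = K / S, so sum(b) + u == 1
    prob = [a / strength for a in alpha]        # mean of the Dirichlet
    return prob, belief, uncertainty


if __name__ == "__main__":
    # Two illustrative pixels: strong evidence for class 0 vs. almost no evidence.
    for name, logits in [("confident", [8.0, -4.0, -4.0]),
                         ("ambiguous", [-4.0, -4.0, -4.0])]:
        prob, belief, u = dirichlet_opinion(logits)
        print(name, [round(p, 3) for p in prob], "u =", round(u, 3))
```

Under this mapping, a pixel with little total evidence receives an uncertainty mass close to 1, while strong evidence for one class drives the uncertainty down. A per-pixel quantity of this kind is what an uncertainty-aware filter could threshold on.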