Confidence Matters: Uncertainty Quantification and Precision Assessment of Deep Learning-based CMR Biomarker Estimates Using Scan-rescan Data

📅 2026-03-25
🤖 AI Summary
This study addresses a limitation of current deep learning approaches to cardiac magnetic resonance (CMR) biomarker estimation: they are evaluated mainly on point estimate accuracy, while scan-rescan consistency and precision are neglected. The authors apply three uncertainty estimation techniques (deep ensembles, test-time augmentation, and Monte Carlo dropout) to a state-of-the-art pipeline for cardiac functional biomarker estimation, so that each biomarker is described by a distribution rather than a single value. They propose new distribution-based metrics, built on confidence interval overlap and statistical similarity tests, to evaluate biomarker reproducibility. On two external scan-rescan validation datasets the model reaches an average Dice coefficient of 87%, yet scan and rescan confidence intervals overlap by more than 50% in fewer than 45% of cases, and similarity tests find statistically significant differences in over 65% of cases. Point estimate metrics can therefore suggest better precision than the distributional analysis supports.
📝 Abstract
The performance of deep learning (DL) methods for the analysis of cine cardiovascular magnetic resonance (CMR) is typically assessed in terms of accuracy, overlooking precision. In this work, uncertainty estimation techniques, namely deep ensemble, test-time augmentation, and Monte Carlo dropout, are applied to a state-of-the-art DL pipeline for cardiac functional biomarker estimation, and new distribution-based metrics are proposed for the assessment of biomarker precision. The model achieved high accuracy (average Dice 87%) and point estimate precision on two external validation scan-rescan CMR datasets. However, distribution-based metrics showed that the overlap between scan/rescan confidence intervals was >50% in less than 45% of the cases. Statistical similarity tests between scan and rescan biomarkers also resulted in significant differences for over 65% of the cases. We conclude that, while point estimate metrics might suggest good performance, distributional analyses reveal lower precision, highlighting the need to use more representative metrics to assess scan-rescan agreement.
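The two distribution-based checks described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact overlap definition or name the similarity test, so the sketch assumes a 95% percentile interval, measures overlap as intersection length divided by the combined span, and swaps in a two-sided permutation test on the difference of means. The function names and the simulated ejection fraction samples are hypothetical.

```python
import random
import statistics


def percentile_interval(samples, level=0.95):
    """Central empirical interval of the samples (linear interpolation)."""
    xs = sorted(samples)
    n = len(xs)
    lo_q = (1.0 - level) / 2.0
    hi_q = 1.0 - lo_q

    def quantile(p):
        pos = p * (n - 1)
        i = int(pos)
        frac = pos - i
        if i + 1 < n:
            return xs[i] + frac * (xs[i + 1] - xs[i])
        return xs[i]

    return quantile(lo_q), quantile(hi_q)


def interval_overlap(a, b):
    """Assumed overlap metric: intersection length / combined span, in [0, 1]."""
    intersection = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    span = max(a[1], b[1]) - min(a[0], b[0])
    return intersection / span if span > 0 else 1.0


def permutation_test(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in means.

    Stand-in for the paper's (unspecified) statistical similarity test.
    """
    rng = random.Random(seed)
    observed = abs(statistics.fmean(x) - statistics.fmean(y))
    pooled = list(x) + list(y)
    n_x = len(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_x]) - statistics.fmean(pooled[n_x:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical per-subject samples, e.g. ejection fraction (%) values
    # produced by ensemble members, augmentations, or dropout passes.
    rng = random.Random(42)
    scan = [rng.gauss(58.0, 1.5) for _ in range(200)]
    rescan = [rng.gauss(61.0, 1.5) for _ in range(200)]

    overlap = interval_overlap(percentile_interval(scan), percentile_interval(rescan))
    p_value = permutation_test(scan, rescan)
    print(f"CI overlap: {overlap:.2f}, permutation p-value: {p_value:.4f}")
```

Under these assumptions, a subject would count toward the "overlap above 50%" group when `interval_overlap` exceeds 0.5, and toward the "significantly different" group when the p-value falls below a chosen threshold such as 0.05.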
Problem

Research questions and friction points this paper is trying to address.

uncertainty quantification
precision assessment
cardiovascular magnetic resonance
scan-rescan agreement
biomarker estimation
Innovation

Methods, ideas, or system contributions that make the work stand out.

uncertainty quantification
precision assessment
scan-rescan agreement
deep learning
distribution-based metrics
Authors

Dewmini Hasara Wickremasinghe
School of Biomedical Engineering & Imaging Sciences, King’s College London, London, UK
Michelle Gibogwe
School of Biomedical Engineering & Imaging Sciences, King’s College London, London, UK
Andrew Bell
New York University
artificial intelligence, machine learning, explainability, fairness
Esther Puyol-Antón
Senior Research Scientist, HeartFlow
Medical Image Analysis, Machine Learning, Computer Vision
Muhummad Sohaib Nazir
School of Biomedical Engineering & Imaging Sciences, King’s College London, London, UK; Cardio-Oncology Centre of Excellence, Royal Brompton Hospitals, London, United Kingdom
Reza Razavi
School of Biomedical Engineering & Imaging Sciences, King’s College London, London, UK
Bruno Paun
Perspectum Ltd., Oxford, UK
Paul Aljabar
Perspectum Ltd
Biomedical Image Analysis, Machine Learning, Medical Imaging
Andrew P. King
School of Biomedical Engineering & Imaging Sciences, King’s College London, London, UK