Uncertainty Reliability Under Domain Shift: An Investigation for Data-Driven Blood Pressure Estimation in Photoplethysmography

📅 2026-05-18

🤖 AI Summary
This study addresses unreliable uncertainty estimation in cuffless, photoplethysmography (PPG)-based blood pressure prediction under out-of-distribution (OOD) conditions. Using a single XResNet1D-50 backbone trained on PulseDB and tested on four external datasets, the authors compare two uncertainty quantification strategies, deep ensembles (DE) and Monte Carlo dropout (MCD), each trained with either a Gaussian negative log-likelihood (GNLL) or a mean squared error (MSE) loss. Post-hoc recalibration via conformal prediction (CP), temperature scaling (TS), or isotonic regression (IR) is optionally applied to refine predictive uncertainty. The results show that DE is more robust than MCD in predictive performance, an advantage that emerges mainly under external shift; that GNLL gives the strongest native uncertainty estimates; and that MSE-based uncertainty needs recalibration to become practically useful. Recalibrated GNLL-based methods are the best calibrated (GNLL+DE+CP for systolic and GNLL+DE+TS for diastolic blood pressure), and CP and TS deliver the most consistent gains, with IR remaining competitive in several cases.
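To make the GNLL loss and the deep-ensemble aggregation named in the summary concrete, the following is a minimal NumPy sketch. It assumes each ensemble member outputs a predictive mean and variance per sample, and that members are combined as a uniform Gaussian mixture, which is a common convention but not confirmed here as the authors' exact procedure. The function names `gaussian_nll` and `ensemble_moments` are hypothetical.

```python
import numpy as np


def gaussian_nll(y, mu, var, eps=1e-6):
    """Mean Gaussian negative log-likelihood of targets y under N(mu, var).

    eps is an assumed variance floor for numerical stability.
    """
    var = np.maximum(var, eps)
    return float(np.mean(0.5 * (np.log(2 * np.pi * var) + (y - mu) ** 2 / var)))


def ensemble_moments(mus, variances):
    """Combine M ensemble members (arrays of shape [M, N]) into one prediction.

    Assumes a uniform Gaussian mixture: the combined variance is the mean of
    the member variances plus the spread of the member means.
    """
    mu = mus.mean(axis=0)
    within = variances.mean(axis=0)   # average per-member predicted variance
    between = mus.var(axis=0)         # disagreement between member means
    return mu, within + between


# Two hypothetical members predicting SBP (mmHg) for one sample.
mu, var = ensemble_moments(np.array([[120.0], [130.0]]),
                           np.array([[4.0], [6.0]]))
print(mu, var)
```

With an MSE-trained ensemble there is no per-member variance, so only the between-member term would be available, which is one plausible reason MSE-based uncertainty benefits more from recalibration.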
📝 Abstract
Uncertainty quantification (UQ) is critical for safety-critical domains like healthcare, yet it is rarely evaluated under realistic out-of-distribution (OOD) conditions. Here, we assessed predictive performance and uncertainty reliability for deep learning-based blood pressure (BP) estimation from photoplethysmography (PPG) signals under both in-distribution (ID) and OOD settings. Using an XResNet1D-50 trained on PulseDB and tested on four external datasets, we compared deep ensembles (DE) and Monte Carlo dropout (MCD) with Gaussian negative log-likelihood (GNLL) and mean squared error (MSE) losses, optionally followed by post-hoc recalibration via conformal prediction (CP), temperature scaling (TS), and isotonic regression (IR). The key findings of our study are as follows: (1) DE provides stronger predictive robustness under domain shift than MCD, an advantage that becomes clear primarily under external shift. (2) Recalibrated GNLL-based methods yield the best uncertainty calibration (e.g., GNLL+DE+CP for systolic blood pressure (SBP), GNLL+DE+TS for diastolic blood pressure (DBP)), while MSE-based uncertainty requires recalibration to become practically useful. (3) Across settings, CP and TS offer the most consistent gains, with IR remaining competitive in several cases. Overall, our results identify DE-based methods as most robust for predictive performance under domain shift, GNLL as strongest for native UQ, and recalibration as essential for making MSE-based uncertainty practical. These findings highlight the need to jointly assess predictive accuracy and calibration on external data for trustworthy cuffless BP estimation.
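The post-hoc recalibration step in the abstract can be illustrated with a short sketch. The abstract does not specify which variants of TS and CP were used, so this assumes two common regression formulations: a single scalar that rescales the predicted standard deviation to minimize Gaussian NLL on a held-out calibration set (here in closed form), and split conformal prediction with residuals normalized by the predicted standard deviation. The names `fit_temperature` and `conformal_quantile` are hypothetical.

```python
import math

import numpy as np


def fit_temperature(y, mu, sigma):
    """Scalar s such that N(mu, (s * sigma)^2) minimizes Gaussian NLL.

    Closed form on the calibration set: s^2 = mean(((y - mu) / sigma)^2).
    """
    z = (y - mu) / sigma
    return float(np.sqrt(np.mean(z ** 2)))


def conformal_quantile(y, mu, sigma, alpha=0.1):
    """Split-conformal multiplier q so that mu +/- q * sigma targets
    at least 1 - alpha marginal coverage on exchangeable data."""
    scores = np.sort(np.abs(y - mu) / sigma)
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # finite-sample corrected rank
    if k > n:
        return float("inf")  # too few calibration points for this alpha
    return float(scores[k - 1])


# Synthetic, overconfident model: true noise is twice the predicted sigma.
rng = np.random.default_rng(0)
mu = np.full(5000, 120.0)
sigma = np.ones(5000)
y = mu + 2.0 * sigma * rng.standard_normal(5000)

s = fit_temperature(y, mu, sigma)
q = conformal_quantile(y, mu, sigma, alpha=0.1)
print(s, q)
```

Both methods rely on the calibration data resembling the test data. Under domain shift that assumption is weakened, which is why the abstract stresses checking calibration on external datasets as well.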
Problem

Research questions and friction points this paper is trying to address.

uncertainty quantification
domain shift
blood pressure estimation
photoplethysmography
out-of-distribution
Innovation

Methods, ideas, or system contributions that make the work stand out.

uncertainty quantification
domain shift
blood pressure estimation
deep ensembles
conformal prediction