Signal Fidelity Index-Aware Calibration for Dementia Predictions Across Heterogeneous Real-World Data

📅 2025-09-10
🤖 AI Summary
Problem: Predictive models for dementia generalize poorly across heterogeneous electronic health record (EHR) systems because of diagnostic signal decay, that is, inter-institutional variation in diagnostic quality and temporal consistency. Method: We propose a label-free calibration framework built on the Signal Fidelity Index (SFI), a patient-level metric with six interpretable components (diagnostic specificity, temporal consistency, entropy, contextual concordance, medication alignment, and trajectory stability) that assesses data reliability without outcome labels. Using an epidemiology-informed synthetic data simulation, we apply a multiplicative calibration adjustment and tune its parameter across simulation batches (optimal α = 2.0). Results: On the simulated heterogeneous EHR datasets, SFI-aware calibration improves balanced accuracy, recall, F1-score, and detection rate by 10.3%, 32.5%, 26.1%, and 41.1%, respectively, approaching reference-standard performance. The work presents SFI-aware calibration as a practical, label-free way to improve clinical prediction across institutions.

📝 Abstract
Background: Machine learning models trained on electronic health records (EHRs) often degrade across healthcare systems due to distributional shift. A fundamental but underexplored factor is diagnostic signal decay: variability in diagnostic quality and consistency across institutions, which affects the reliability of codes used for training and prediction.
Objective: To develop a Signal Fidelity Index (SFI) quantifying diagnostic data quality at the patient level in dementia, and to test SFI-aware calibration for improving model performance across heterogeneous datasets without outcome labels.
Methods: We built a simulation framework generating 2,500 synthetic datasets, each with 1,000 patients and realistic demographics, encounters, and coding patterns based on dementia risk factors. The SFI was derived from six interpretable components: diagnostic specificity, temporal consistency, entropy, contextual concordance, medication alignment, and trajectory stability. SFI-aware calibration applied a multiplicative adjustment, optimized across 50 simulation batches.
Results: At the optimal parameter (α = 2.0), SFI-aware calibration significantly improved all metrics (p < 0.001). Gains ranged from 10.3% for Balanced Accuracy to 32.5% for Recall, with notable increases in Precision (31.9%) and F1-score (26.1%). Performance approached reference standards, with F1-score and Recall within 1% and Balanced Accuracy and Detection Rate improved by 52.3% and 41.1%, respectively.
Conclusions: Diagnostic signal decay is a tractable barrier to model generalization. SFI-aware calibration provides a practical, label-free strategy to enhance prediction across healthcare contexts, particularly for large-scale administrative datasets lacking outcome labels.
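The metrics named in the abstract have standard definitions in terms of confusion-matrix counts. The Python sketch below illustrates them under stated assumptions: the `Confusion` class and function names are hypothetical, Detection Rate is left out because the abstract does not define it, and `relative_gain_pct` assumes the reported gains are relative changes, which the abstract does not confirm.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts from a binary confusion matrix (hypothetical helper type)."""
    tp: int
    fp: int
    tn: int
    fn: int


def _ratio(num: float, den: float) -> float:
    # Return 0.0 instead of raising when a denominator is empty.
    return num / den if den else 0.0


def recall(c: Confusion) -> float:
    return _ratio(c.tp, c.tp + c.fn)


def precision(c: Confusion) -> float:
    return _ratio(c.tp, c.tp + c.fp)


def specificity(c: Confusion) -> float:
    return _ratio(c.tn, c.tn + c.fp)


def balanced_accuracy(c: Confusion) -> float:
    # Mean of recall (sensitivity) and specificity.
    return (recall(c) + specificity(c)) / 2


def f1_score(c: Confusion) -> float:
    p, r = precision(c), recall(c)
    return _ratio(2 * p * r, p + r)


def relative_gain_pct(before: float, after: float) -> float:
    # ASSUMPTION: gains are relative changes; the source does not say.
    return 100 * (after - before) / before
```

For example, `balanced_accuracy(Confusion(tp=40, fp=10, tn=40, fn=10))` averages a recall of 40/50 with a specificity of 40/50.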
Problem

Research questions and friction points this paper is trying to address.

Addressing diagnostic signal decay in EHR data across institutions
Quantifying patient-level diagnostic data quality for dementia predictions
Improving model generalization without outcome labels via calibration
Innovation

Methods, ideas, or system contributions that make the work stand out.

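Developed the Signal Fidelity Index (SFI), a patient-level measure of diagnostic data quality built from six interpretable components
Applied a multiplicative, SFI-based calibration adjustment that requires no outcome labels
Built a simulation framework generating 2,500 synthetic datasets of 1,000 patients each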
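To make the two core ideas concrete, the Python sketch below combines six component scores into a single index and applies a multiplicative adjustment to a predicted probability. It is an illustration only: this page does not give the paper's formulas, so the unweighted mean, the adjustment form `p * (1 + alpha * (sfi - reference_sfi))`, the clipping to [0, 1], and all function and parameter names are assumptions. Only the six component names and α = 2.0 come from the abstract.

```python
from statistics import mean
from typing import Mapping

# Component names as listed in the abstract.
COMPONENTS = (
    "diagnostic_specificity",
    "temporal_consistency",
    "entropy",
    "contextual_concordance",
    "medication_alignment",
    "trajectory_stability",
)


def _clip01(x: float) -> float:
    return min(1.0, max(0.0, x))


def signal_fidelity_index(scores: Mapping[str, float]) -> float:
    """ASSUMED aggregation: unweighted mean of six scores clipped to [0, 1]."""
    missing = [name for name in COMPONENTS if name not in scores]
    if missing:
        raise ValueError(f"missing SFI components: {missing}")
    return mean(_clip01(scores[name]) for name in COMPONENTS)


def sfi_calibrate(prob: float, sfi: float, reference_sfi: float,
                  alpha: float = 2.0) -> float:
    """ASSUMED multiplicative form, not the paper's confirmed formula.

    Scales the prediction up when the patient's SFI exceeds the reference
    level and down when it falls below, then clips to a valid probability.
    """
    factor = 1.0 + alpha * (sfi - reference_sfi)
    return _clip01(prob * factor)
```

Under these assumptions, a patient with SFI 0.7 against a reference of 0.5 has a prediction of 0.4 scaled by a factor of 1.4, while a patient at the reference level is left unchanged. No outcome labels enter the adjustment, which is the property the paper emphasizes.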
Jingya Cheng
Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
Jiazi Tian
Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
Federica Spoto
Astronomer, Center for Astrophysics, Harvard & Smithsonian
celestial mechanics, astronomy, applied mathematics
Alaleh Azhir
Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA
Daniel Mork
Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
Hossein Estiri
Harvard Medical School
Research Informatics, Data Science, AI, Demography