An empirical evaluation of the risks of AI model updates using clinical data: stability, arbitrariness, and fairness

📅 2026-04-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study examines the risks of updating clinical AI models: updates can make predictions less stable, more arbitrary, and less fair across subpopulations, all of which undermine the reliability of decisions. Using the prediction of severe hyperglycemic events in pediatric type 1 diabetes as a case study, the work systematically evaluates how different model update strategies affect stability, arbitrariness, and fairness. The authors draw on four public datasets covering 496 patients and approximately 11,300 weeks of continuous glucose monitoring data, each with sociodemographic variables, and propose a multidimensional continuous monitoring framework to track predictive consistency and error equity. They report that model updates can flip predictions for many cases and worsen error imbalances among subgroups, which supports ongoing monitoring as a requirement for the responsible deployment of clinical AI systems.

📝 Abstract

Artificial Intelligence and Machine Learning (AI/ML) models are increasingly deployed in clinical settings to support decision-making. However, when training data become stale due to changes in demographics, environment, or patient behaviors, model performance can degrade substantially. While updating models with new training data is necessary, such updates may also introduce new risks. Using the prediction of severe hyperglycemia events in children with type 1 diabetes as a case study, we examine how different model update strategies can adversely affect model stability (e.g., by causing predictions to "flip" for a large number of cases after an update), increase arbitrariness in predictions, or worsen accuracy equity and the balance of error rates across subpopulations. We propose multiple dimensions for continuous monitoring to detect these issues and argue that such monitoring is essential for the development of trustworthy clinical decision support systems. We evaluated the proposed monitoring framework on four publicly available U.S.-based type 1 diabetes datasets containing high-resolution continuous glucose monitoring (CGM) data, comprising approximately 11,300 weekly observations from 496 participants under 20 years of age. All datasets included structured sociodemographic information.
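To make two of the monitored quantities concrete, the following is a minimal sketch of how prediction flips and subgroup error-rate balance might be computed before and after an update. It is not the authors' implementation: the function names (`flip_rate`, `false_negative_rates`, `fnr_gap`), the choice of false negative rate as the error metric, and the toy data are all assumptions made for illustration, since this summary does not specify the paper's exact metrics.

```python
from collections import defaultdict


def flip_rate(old_preds, new_preds):
    """Fraction of cases whose binary prediction changes after an update."""
    if len(old_preds) != len(new_preds):
        raise ValueError("prediction lists must have the same length")
    if not old_preds:
        raise ValueError("prediction lists must not be empty")
    flips = sum(o != n for o, n in zip(old_preds, new_preds))
    return flips / len(old_preds)


def false_negative_rates(y_true, y_pred, groups):
    """False negative rate per subgroup (missed events / actual events)."""
    positives = defaultdict(int)
    misses = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            positives[group] += 1
            if pred == 0:
                misses[group] += 1
    # Only groups with at least one actual event appear in the result.
    return {g: misses[g] / positives[g] for g in positives}


def fnr_gap(y_true, y_pred, groups):
    """Largest difference in false negative rate between any two subgroups."""
    rates = false_negative_rates(y_true, y_pred, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: 1 = severe hyperglycemia event, 0 = no event.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
old_preds = [1, 1, 0, 0, 1, 0, 0, 0]  # model before the update
new_preds = [1, 1, 0, 1, 0, 0, 0, 0]  # model after the update

print("flip rate:", flip_rate(old_preds, new_preds))
print("FNR gap before:", fnr_gap(y_true, old_preds, groups))
print("FNR gap after:", fnr_gap(y_true, new_preds, groups))
```

In this toy example a quarter of the predictions flip, and the gap in missed events between groups A and B widens after the update. Overall accuracy alone would not surface that pattern, which is the kind of issue the proposed monitoring dimensions are meant to detect.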

Problem

Research questions and friction points this paper is trying to address.

AI model updates

clinical decision-making

model stability

prediction arbitrariness

fairness

Innovation

Methods, ideas, or system contributions that make the work stand out.

model updating

stability

fairness