Glucose-ML: A collection of longitudinal diabetes datasets for development of robust AI solutions

📅 2025-07-18
🤖 AI Summary
Diabetes AI research is hindered by the scarcity of large, high-quality, multi-source data. To address this, we introduce Glucose-ML, a standardized collection of ten publicly available longitudinal CGM datasets spanning four countries, more than 2,500 participants, and 38 million glucose readings. Using Glucose-ML, we establish a unified benchmark for short-term blood glucose prediction and conduct cross-dataset algorithm evaluation with attribution analysis, revealing that data provenance has a substantial impact on model robustness. We open-source our codebase, standardized evaluation pipeline, and evidence-based algorithm selection guidelines to advance reproducible, comparable, and interpretable medical AI research. Results show pronounced performance variability for identical algorithms across datasets, with mean absolute error (MAE) fluctuating by up to 32%, which underscores the importance of data representativeness and domain adaptability for clinical AI deployment.

📝 Abstract
Artificial intelligence (AI) algorithms are a critical part of state-of-the-art digital health technology for diabetes management. Yet limited access to large, high-quality datasets creates barriers that impede development of robust AI solutions. To accelerate development of transparent, reproducible, and robust AI solutions, we present Glucose-ML, a collection of 10 publicly available diabetes datasets released within the last 7 years (i.e., 2018-2025). The Glucose-ML collection comprises over 300,000 days of continuous glucose monitor (CGM) data with a total of 38 million glucose samples collected from 2500+ people across 4 countries. Participants include persons living with type 1 diabetes, type 2 diabetes, prediabetes, and no diabetes. To support researchers and innovators in using this rich collection of diabetes datasets, we present a comparative analysis to guide algorithm developers in data selection. Additionally, we conduct a case study for the task of blood glucose prediction, one of the most common AI tasks within the field. Through this case study, we provide a benchmark for short-term blood glucose prediction across all 10 publicly available diabetes datasets within the Glucose-ML collection. We show that the same algorithm can produce significantly different prediction results when developed or evaluated with different datasets. Findings from this study are then used to inform recommendations for developing robust AI solutions within the diabetes or broader health domain. We provide direct links to each longitudinal diabetes dataset in the Glucose-ML collection and openly provide our code.
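To make the cross-dataset comparison concrete, the following is a minimal sketch of how one algorithm might be scored on several datasets with a single metric. It is not the authors' benchmark code. It assumes a naive persistence forecaster (predict that glucose stays at its last observed value), a 30-minute horizon at 5-minute sampling, and MAE in mg/dL. The names `synthetic_cgm`, `dataset_a`, and `dataset_b` are hypothetical, and the traces are synthetic random walks, not data from the Glucose-ML collection.

```python
import random


def persistence_forecast(series, horizon_steps):
    """Naive baseline: the prediction for time t + horizon is the reading at time t."""
    preds = series[:-horizon_steps]
    targets = series[horizon_steps:]
    return preds, targets


def mean_absolute_error(preds, targets):
    """Average absolute difference between predictions and targets."""
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)


def synthetic_cgm(n, volatility, seed):
    """Hypothetical CGM trace in mg/dL: a clipped random walk, not real data."""
    rng = random.Random(seed)
    value, trace = 120.0, []
    for _ in range(n):
        # Clip to a plausible sensor range so the walk cannot drift without bound.
        value = min(400.0, max(40.0, value + rng.gauss(0.0, volatility)))
        trace.append(value)
    return trace


# Two stand-in "datasets" that differ in volatility and random seed (assumed values).
datasets = {
    "dataset_a": synthetic_cgm(2000, volatility=2.0, seed=0),
    "dataset_b": synthetic_cgm(2000, volatility=5.0, seed=1),
}

HORIZON_STEPS = 6  # 30 minutes, assuming one reading every 5 minutes

# Score the identical algorithm on every dataset with the identical metric.
mae_by_dataset = {}
for name, trace in datasets.items():
    preds, targets = persistence_forecast(trace, HORIZON_STEPS)
    mae_by_dataset[name] = mean_absolute_error(preds, targets)

for name, mae in mae_by_dataset.items():
    print(f"{name}: MAE = {mae:.1f} mg/dL")
```

Because the forecaster and the metric are held fixed, any gap between the per-dataset scores comes from the data alone, which is the kind of effect the case study reports.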
Problem

Research questions and friction points this paper is trying to address.

Lack of large high-quality diabetes datasets for AI development
Need for transparent and reproducible AI solutions in diabetes management
Variability in AI performance across different diabetes datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Collection of 10 public diabetes datasets
Over 300,000 days of CGM data (38 million glucose samples) from 2500+ people across 4 countries
Benchmark for short-term blood glucose prediction across all 10 datasets