GlyRAG: Context-Aware Retrieval-Augmented Framework for Blood Glucose Forecasting

📅 2026-01-08
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes GlyRAG, a novel framework that addresses the limitations of existing glucose prediction models—namely, their neglect of contextual information and reliance on hard-to-deploy auxiliary sensors—by leveraging a large language model (LLM) as a context extractor to generate clinical semantic summaries directly from continuous glucose monitoring (CGM) data. GlyRAG integrates these textual summaries with physiological signals through a multimodal Transformer and enhances long-term forecasting via a retrieval-augmented mechanism that incorporates historically similar cases. Requiring no additional sensors, the approach achieves semantic awareness and case-based enhancement, yielding up to a 39% reduction in RMSE, a 51% improvement in hypoglycemia/hyperglycemia event prediction accuracy, and 85% of predictions falling within the clinically safe zone across two type 1 diabetes cohorts.

📝 Abstract
Accurate forecasting of blood glucose from CGM is essential for preventing dysglycemic events, thus enabling proactive diabetes management. However, current forecasting models treat blood glucose readings captured using CGMs as a numerical sequence, either ignoring context or relying on additional sensors/modalities that are difficult to collect and deploy at scale. Recently, LLMs have shown promise for time-series forecasting tasks, yet their role as agentic context extractors in diabetes care remains largely unexplored. To address these limitations, we propose GlyRAG, a context-aware, retrieval-augmented forecasting framework that derives semantic understanding of blood glucose dynamics directly from CGM traces without requiring additional sensor modalities. GlyRAG employs an LLM as a contextualization agent to generate clinical summaries. These summaries are embedded by a language model and fused with patch-based glucose representations in a multimodal transformer architecture, with a cross-translation loss aligning textual and physiological embeddings. A retrieval module then identifies similar historical episodes in the learned embedding space and uses cross-attention to integrate these case-based analogues prior to forecasting inference. Extensive evaluations on two T1D cohorts show that GlyRAG consistently outperforms state-of-the-art methods, achieving up to 39% lower RMSE and a further 1.7% reduction in RMSE over the baseline. Clinical evaluation shows that GlyRAG places 85% of predictions in safe zones and achieves a 51% improvement in predicting dysglycemic events across both cohorts. These results indicate that LLM-based contextualization and retrieval over CGM traces can enhance the accuracy and clinical reliability of long-horizon glucose forecasting without the need for extra sensors, thus supporting future agentic decision-support tools for diabetes management.
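The pipeline the abstract describes — patch-based glucose embeddings, a text-summary embedding fused alongside them, cosine-similarity retrieval of historical episodes, and cross-attention over the retrieved analogues — can be sketched at a high level. This is a minimal NumPy illustration, not the paper's implementation: the projection matrices are random stand-ins for learned weights, the embedding dimension (32), patch length (12), memory bank, and pooling choice are all assumptions, and the LLM summary embedding is simulated.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed_patches(cgm, patch_len=12, dim=32):
    """Split a CGM trace into fixed-length patches and project each to an
    embedding. The projection is random here; in practice it is learned."""
    n = len(cgm) // patch_len
    patches = cgm[: n * patch_len].reshape(n, patch_len)
    W = rng.normal(size=(patch_len, dim))
    return patches @ W  # (n_patches, dim)

def retrieve(query_vec, memory, k=3):
    """Return the k most cosine-similar historical episode embeddings."""
    sims = memory @ query_vec / (
        np.linalg.norm(memory, axis=1) * np.linalg.norm(query_vec) + 1e-8
    )
    return memory[np.argsort(sims)[-k:]]

def cross_attention(queries, keys_values):
    """Single-head cross-attention: current tokens attend over retrieved
    case-based analogues."""
    scores = queries @ keys_values.T / np.sqrt(queries.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ keys_values

# Toy data: a 6-hour CGM trace at 5-minute sampling (72 readings, mg/dL).
cgm = 120 + 30 * np.sin(np.linspace(0, 3, 72)) + rng.normal(0, 5, 72)

glucose_emb = embed_patches(cgm)             # 6 patch embeddings
text_emb = rng.normal(size=(1, 32))          # stand-in for the LLM summary embedding
fused = np.concatenate([glucose_emb, text_emb], axis=0)  # (7, 32)

memory = rng.normal(size=(100, 32))          # hypothetical historical episode bank
query = fused.mean(axis=0)                   # pooled episode representation
analogues = retrieve(query, memory, k=3)     # 3 nearest historical episodes
augmented = cross_attention(fused, analogues)  # case-based enhancement

print(augmented.shape)  # (7, 32): 6 glucose patches + 1 summary token
```

In the full model, the augmented token sequence would feed a transformer forecasting head, and the cross-translation loss mentioned in the abstract would pull the text and glucose embeddings into a shared space so that retrieval by cosine similarity is meaningful.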
Problem

Research questions and friction points this paper is trying to address.

blood glucose forecasting
context-aware
retrieval-augmented
CGM
diabetes management
Innovation

Methods, ideas, or system contributions that make the work stand out.

Retrieval-Augmented Generation
Large Language Models
Blood Glucose Forecasting
Multimodal Fusion
Context-Aware Time Series