Rethinking Large Language Models For Irregular Time Series Classification In Critical Care

📅 2026-01-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the challenge of classifying highly incomplete and irregularly sampled time series from intensive care unit (ICU) settings by systematically evaluating two core components of large language model (LLM)-based approaches: the time series encoder and the multimodal alignment strategy. Using a unified benchmarking framework on standard ICU datasets, the authors compare multiple state-of-the-art methods against strong supervised and self-supervised baselines and find that encoder design matters substantially more than the alignment strategy: explicitly modeling temporal irregularity in the encoder yields the largest gains. The best encoder improves average AUPRC by 12.8% over a vanilla Transformer, while the best alignment strategy contributes a 2.9% gain over cross-attention. This work provides the first quantitative assessment of how these two components shape LLM-based temporal modeling and highlights inherent limitations of current LLM methods in training efficiency and few-shot scenarios.

📝 Abstract
Time series data from the Intensive Care Unit (ICU) provide critical information for patient monitoring. While recent advances in applying Large Language Models (LLMs) to time series modeling (TSM) have shown great promise, their effectiveness on irregular ICU data, characterized by particularly high rates of missing values, remains largely unexplored. This work investigates two key components underlying the success of LLMs for TSM: the time series encoder and the multimodal alignment strategy. To this end, we establish a systematic testbed to evaluate their impact across various state-of-the-art LLM-based methods on benchmark ICU datasets against strong supervised and self-supervised baselines. Results reveal that encoder design is more critical than the alignment strategy. Encoders that explicitly model irregularity achieve substantial performance gains, yielding an average AUPRC increase of 12.8% over the vanilla Transformer. While less impactful, the alignment strategy is also noteworthy, with the best-performing semantically rich, fusion-based strategy achieving a modest 2.9% improvement over cross-attention. However, LLM-based methods require at least 10× longer training than the best-performing irregular supervised models, while delivering only comparable performance. They also underperform in data-scarce few-shot learning settings. These findings highlight both the promise and the current limitations of LLMs for irregular ICU time series. The code is available at https://github.com/mHealthUnimelb/LLMTS.
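The paper's headline metric is AUPRC (area under the precision-recall curve), a common choice for imbalanced ICU outcomes such as mortality prediction. A minimal sketch of how it is typically computed as average precision is below; the labels and scores are illustrative, not from the paper:

```python
def average_precision(y_true, y_score):
    """Average precision: rank predictions by descending score and sum the
    precision at each true positive, weighted by the recall increment
    (1 / number of positives per hit). Equivalent to AUPRC as usually reported."""
    pairs = sorted(zip(y_score, y_true), key=lambda p: -p[0])
    n_pos = sum(y_true)
    tp, ap = 0, 0.0
    for rank, (_, label) in enumerate(pairs, start=1):
        if label == 1:
            tp += 1
            ap += (tp / rank) / n_pos  # precision at this hit * recall step
    return ap

# Hypothetical binary outcome labels (e.g. in-hospital mortality) and model scores.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.7, 0.6, 0.4, 0.35, 0.3, 0.2, 0.1]
print(f"AUPRC: {average_precision(labels, scores):.3f}")
```

On skewed ICU datasets, AUPRC is preferred over AUROC because it is sensitive to precision among the rare positive class, which is why the paper's 12.8% encoder gain is reported in this metric.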
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Irregular Time Series
ICU
Time Series Classification
Missing Data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Irregular Time Series
Large Language Models
Time Series Encoder
Multimodal Alignment
ICU Monitoring