🤖 AI Summary
Wearable data from psychiatric patients are heterogeneous, multisource, and often heavily incomplete, and labels are scarce, all of which makes clinical behavioral dynamics hard to model.
Method: The paper proposes an unsupervised foundation model for wearable-device time series, built on a modified Vector-Quantized Variational Autoencoder (VQ-VAE) and pretrained on a broad cohort of psychiatric patients. Its discrete latent representation jointly encodes heterogeneous, multisource physiological and behavioral signals, and a probabilistic change-point detection algorithm operates on that representation for unsupervised suicide detection.
Contribution/Results: The pretrained model performs downstream tasks (change-point detection and emotion prediction) from its latent representation without fine-tuning, evaluated on a held-out cohort of suicidal patients. The discrete latent structure outperforms a state-of-the-art Informer architecture in unsupervised suicide detection. It matches Informer in supervised emotion prediction only when the latent dimensionality is increased, which in turn reduces unsupervised accuracy. This trade-off motivates hybrid discrete-continuous latent structures in future foundation models.
📝 Abstract
Foundation models (FMs) have achieved remarkable success across various domains, yet their adoption in healthcare remains limited. While significant advances have been made in medical imaging, genetic biomarkers, and time series from electronic health records, the potential of FMs for patient behavior monitoring through wearable devices remains underexplored. These datasets are inherently heterogeneous, multisource, and often exhibit high rates of missing data, posing unique challenges. This paper introduces a novel FM based on a modified vector quantized variational autoencoder (VQ-VAE), specifically designed to process real-world data from wearable devices. We demonstrate that our pretrained FM, trained on a broad cohort of psychiatric patients, performs downstream tasks via its latent representation without fine-tuning on a held-out cohort of suicidal patients. To illustrate this, we develop a probabilistic change-point detection algorithm for suicide detection and demonstrate the FM's effectiveness in predicting emotional states. Our results show that the discrete latent structure of the VQ-VAE outperforms a state-of-the-art Informer architecture in unsupervised suicide detection, while matching its performance in supervised emotion prediction when the latent dimensionality is increased, though at the cost of reduced unsupervised accuracy. This trade-off highlights the need for future FMs to integrate hybrid discrete-continuous structures for balanced performance across tasks.
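To make the idea of change-point detection on a discrete latent sequence more concrete, here is a minimal sketch in Python. It is not the authors' algorithm: the nearest-codebook lookup is the standard VQ-VAE quantization step, while `split_score`, the smoothing parameter `alpha`, the toy codebook, and the synthetic embeddings are all assumptions introduced for illustration. The score is a simple log-likelihood ratio between a two-segment and a one-segment categorical model.

```python
import numpy as np

def quantize(z, codebook):
    """Map each row of z (T, D) to the index of its nearest codebook row (K, D)."""
    # Squared Euclidean distance between every embedding and every code vector.
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def split_score(codes, t, num_codes, alpha=1.0):
    """Log-likelihood ratio for a change at position t (hypothetical helper).

    Compares two smoothed categorical models, one fitted before t and one
    fitted from t onward, against a single model fitted to the whole sequence.
    """
    def loglik(seg):
        counts = np.bincount(seg, minlength=num_codes) + alpha  # additive smoothing
        p = counts / counts.sum()
        return np.log(p[seg]).sum()
    return loglik(codes[:t]) + loglik(codes[t:]) - loglik(codes)

# Toy data (assumed): embeddings drift from one code vector to the other at step 20.
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
rng = np.random.default_rng(0)
z = np.concatenate([rng.normal(0.0, 0.1, (20, 2)), rng.normal(1.0, 0.1, (20, 2))])

codes = quantize(z, codebook)
scores = [split_score(codes, t, num_codes=2) for t in range(1, len(codes))]
best_t = 1 + int(np.argmax(scores))  # candidate change point with the highest score
print(best_t)
```

Because the sequence is reduced to a small alphabet of code indices, a shift in behavior shows up as a shift in code frequencies, which a categorical model can score directly. A full probabilistic treatment would place a prior over the change location and handle missing observations, which this sketch omits.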