ViFusionTST: Deep Fusion of Time-Series Image Representations from Load Signals for Early Bed-Exit Prediction

📅 2025-06-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Bedside fall prevention in long-term care remains challenging because residents' intent to get out of bed is difficult to detect early. Method: We propose a low-cost, non-contact approach that uses four load sensors mounted on the bed legs to capture one-dimensional weight signals. These signals are transformed into RGB curve images and three types of time-series texture images: recurrence plots, Markov transition fields, and Gramian angular fields. A novel dual-stream Swin Transformer, ViFusionTST, uses cross-attention to fuse the image modalities adaptively and deeply, eliminating handcrafted feature engineering. Contribution/Results: The data-driven framework automatically learns optimal multimodal weights and achieves 0.885 accuracy and a 0.794 F1-score in a real-world clinical setting, outperforming state-of-the-art 1D and 2D time-series models. This validates the efficacy and clinical practicality of combining signal-to-image conversion with dual-stream vision Transformers for contactless behavioral intention recognition.

📝 Abstract
Bed-related falls remain a leading source of injury in hospitals and long-term-care facilities, yet many commercial alarms trigger only after a patient has already left the bed. We show that early bed-exit intent can be predicted using only four low-cost load cells mounted under the bed legs. The resulting load signals are first converted into a compact set of complementary images: an RGB line plot that preserves raw waveforms, and three texture maps (recurrence plot, Markov transition field, and Gramian angular field) that expose higher-order dynamics. We introduce ViFusionTST, a dual-stream Swin Transformer that processes the line plot and texture maps in parallel and fuses them through cross-attention to learn data-driven modality weights. To provide a realistic benchmark, we collected six months of continuous data from 95 beds in a long-term-care facility. On this real-world dataset ViFusionTST reaches an accuracy of 0.885 and an F1 score of 0.794, surpassing recent 1D and 2D time-series baselines across F1, recall, accuracy, and AUPRC. The results demonstrate that image-based fusion of load-sensor signals for time-series classification is a practical and effective solution for real-time, privacy-preserving fall prevention.
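The signal-to-image step described in the abstract can be sketched in plain NumPy. This is a minimal illustration, not the paper's exact preprocessing: the min-max normalization, the recurrence threshold `eps`, and the function names are assumptions.

```python
import numpy as np

def gramian_angular_field(x):
    """Gramian angular summation field of a 1-D signal."""
    # Rescale to [-1, 1] so arccos is defined (assumed normalization).
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    phi = np.arccos(np.clip(x, -1.0, 1.0))       # polar-coordinate angles
    return np.cos(phi[:, None] + phi[None, :])   # pairwise angular sums

def recurrence_plot(x, eps=0.1):
    """Binary recurrence plot: 1 where two samples lie within eps."""
    d = np.abs(x[:, None] - x[None, :])          # pairwise distances
    return (d <= eps).astype(float)

sig = np.sin(np.linspace(0, 4 * np.pi, 64))      # stand-in for a load signal
gaf = gramian_angular_field(sig)
rp = recurrence_plot(sig)
```

Each transform turns a length-N signal into an N×N image, so the four per-leg weight traces become image channels that a vision backbone such as a Swin Transformer can consume directly.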
Problem

Research questions and friction points this paper is trying to address.

Predicting early bed-exit intent to prevent falls
Fusing image representations from load signals
Improving accuracy in real-world bed-exit detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses load cells for early bed-exit prediction
Converts signals to RGB and texture images
Dual-stream Swin Transformer with cross-attention fusion
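The cross-attention fusion listed above can be illustrated with a minimal single-head sketch in NumPy. The token counts, dimensions, and random projection matrices are placeholders, not ViFusionTST's actual architecture or weights.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)        # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_attention(q_tokens, kv_tokens, d=16):
    """Single-head cross-attention: queries come from one stream,
    keys/values from the other (illustrative random projections)."""
    Wq = rng.normal(size=(q_tokens.shape[-1], d))
    Wk = rng.normal(size=(kv_tokens.shape[-1], d))
    Wv = rng.normal(size=(kv_tokens.shape[-1], d))
    Q, K, V = q_tokens @ Wq, kv_tokens @ Wk, kv_tokens @ Wv
    attn = softmax(Q @ K.T / np.sqrt(d))         # learned modality weights
    return attn @ V, attn

line_tokens = rng.normal(size=(49, 32))          # tokens from the RGB-line stream
texture_tokens = rng.normal(size=(49, 32))       # tokens from the texture stream
fused, attn = cross_attention(line_tokens, texture_tokens)
```

Because each attention row sums to 1, the line-plot stream learns, per token, how much to borrow from the texture stream; this is the "data-driven modality weights" idea, with no handcrafted fusion rule.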
Hao Liu
The University of British Columbia, Canada
Yu Hu
The University of British Columbia, Canada
Rakiba Rayhana
Postdoctoral Researcher, The University of British Columbia
Ling Bai
The University of British Columbia, Canada
Zheng Liu
The University of British Columbia, Canada