ViFusionTST: Deep Fusion of Time-Series Image Representations from Load Signals for Early Bed-Exit Prediction

📅 2025-06-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Bedside fall prevention in long-term care remains challenging because residents' intent to get out of bed is difficult to detect early. Method: We propose a low-cost, non-contact approach that uses four load sensors mounted on the bed legs to capture one-dimensional weight signals. These signals are transformed into RGB curve images and three types of time-series texture images: recurrence plots, Markov transition fields, and Gramian angular fields. A novel dual-stream Swin Transformer, ViFusionTST, uses cross-attention to fuse the image modalities adaptively and deeply, eliminating handcrafted feature engineering. Contribution/Results: The data-driven framework automatically learns optimal multimodal weights and achieves 0.885 accuracy and a 0.794 F1-score in a real-world clinical setting, outperforming state-of-the-art 1D and 2D time-series models. This validates the efficacy and clinical practicality of combining signal-to-image conversion with dual-stream vision Transformers for contactless behavioral intention recognition.

📝 Abstract
Bed-related falls remain a leading source of injury in hospitals and long-term-care facilities, yet many commercial alarms trigger only after a patient has already left the bed. We show that early bed-exit intent can be predicted using only four low-cost load cells mounted under the bed legs. The resulting load signals are first converted into a compact set of complementary images: an RGB line plot that preserves raw waveforms, and three texture maps (recurrence plot, Markov transition field, and Gramian angular field) that expose higher-order dynamics. We introduce ViFusionTST, a dual-stream Swin Transformer that processes the line plot and texture maps in parallel and fuses them through cross-attention to learn data-driven modality weights. To provide a realistic benchmark, we collected six months of continuous data from 95 beds in a long-term-care facility. On this real-world dataset ViFusionTST reaches an accuracy of 0.885 and an F1 score of 0.794, surpassing recent 1D and 2D time-series baselines across F1, recall, accuracy, and AUPRC. The results demonstrate that image-based fusion of load-sensor signals for time-series classification is a practical and effective solution for real-time, privacy-preserving fall prevention.
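The signal-to-image step described in the abstract can be sketched in plain NumPy. This is a minimal illustration, not the paper's exact preprocessing: the min-max normalization, the recurrence threshold `eps`, and the function names are assumptions.

```python
import numpy as np

def gramian_angular_field(x):
    """Gramian angular summation field of a 1-D signal."""
    # Rescale to [-1, 1] so arccos is defined (assumed normalization).
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    phi = np.arccos(np.clip(x, -1.0, 1.0))       # polar-coordinate angles
    return np.cos(phi[:, None] + phi[None, :])   # pairwise angular sums

def recurrence_plot(x, eps=0.1):
    """Binary recurrence plot: 1 where two samples lie within eps."""
    d = np.abs(x[:, None] - x[None, :])          # pairwise distances
    return (d <= eps).astype(float)

sig = np.sin(np.linspace(0, 4 * np.pi, 64))      # stand-in for a load signal
gaf = gramian_angular_field(sig)
rp = recurrence_plot(sig)
```

Each transform turns a length-N signal into an N×N image, so the four per-leg weight traces become image channels that a vision backbone such as a Swin Transformer can consume directly.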
Problem

Research questions and friction points this paper is trying to address.

Predicting early bed-exit intent to prevent falls
Fusing image representations from load signals
Improving accuracy in real-world bed-exit detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses load cells for early bed-exit prediction
Converts signals to RGB and texture images
Dual-stream Swin Transformer with cross-attention fusion
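The cross-attention fusion listed above can be illustrated with a minimal single-head sketch in NumPy. The token counts, dimensions, and random projection matrices are placeholders, not ViFusionTST's actual architecture or weights.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)        # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_attention(q_tokens, kv_tokens, d=16):
    """Single-head cross-attention: queries come from one stream,
    keys/values from the other (illustrative random projections)."""
    Wq = rng.normal(size=(q_tokens.shape[-1], d))
    Wk = rng.normal(size=(kv_tokens.shape[-1], d))
    Wv = rng.normal(size=(kv_tokens.shape[-1], d))
    Q, K, V = q_tokens @ Wq, kv_tokens @ Wk, kv_tokens @ Wv
    attn = softmax(Q @ K.T / np.sqrt(d))         # learned modality weights
    return attn @ V, attn

line_tokens = rng.normal(size=(49, 32))          # tokens from the RGB-line stream
texture_tokens = rng.normal(size=(49, 32))       # tokens from the texture stream
fused, attn = cross_attention(line_tokens, texture_tokens)
```

Because each attention row sums to 1, the line-plot stream learns, per token, how much to borrow from the texture stream; this is the "data-driven modality weights" idea, with no handcrafted fusion rule.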
Hao Liu
The University of British Columbia, Canada
Yu Hu
The University of British Columbia, Canada
Rakiba Rayhana
Postdoctoral Researcher, The University of British Columbia
Ling Bai
The University of British Columbia, Canada
Zheng Liu
The University of British Columbia, Canada