Multimodal signal fusion for stress detection using deep neural networks: a novel approach for converting 1D signals to unified 2D images

πŸ“… 2025-09-16
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
To address the challenges of fusing multimodal physiological signals (PPG, GSR, ACC), weak modeling of temporal dependencies, and poor interpretability, this paper proposes a structured 2D image transformation method: each univariate signal channel is converted into a unified RGB-like image matrix via time-frequency mapping and spatial arrangement, explicitly encoding cross-modal correlations in pixel space. Building upon this representation, we design a lightweight CNN architecture augmented with multi-stage progressive training and physiology-aware data augmentation based on image transformations. Experiments on a public stress recognition dataset demonstrate a 6.2% improvement in F1-score over conventional sequential models. The learned image representations exhibit both high discriminability and visual interpretability, enabling real-time, robust, and personalized stress monitoring on wearable devicesβ€”a novel paradigm for edge-deployable physiological analytics.
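The core idea above, converting three 1D signal channels into one RGB-like image matrix, can be sketched minimally as follows. This is an illustrative assumption, not the paper's exact pipeline: each windowed channel is min-max normalized and reshaped into a 2D grid, and the three grids are stacked as color planes (the paper additionally applies a time-frequency mapping, which is omitted here).

```python
import numpy as np

def signals_to_image(ppg, gsr, acc, height=32, width=32):
    """Fuse three 1D physiological signal windows into one RGB-like
    2D image: min-max normalize each channel, reshape it into a
    height x width grid, and stack the grids as R, G, B planes.
    Illustrative sketch only; the paper's time-frequency mapping and
    spatial arrangement are not reproduced here."""
    n = height * width
    planes = []
    for sig in (ppg, gsr, acc):
        sig = np.asarray(sig, dtype=np.float64)[:n]
        lo, hi = sig.min(), sig.max()
        # Guard against a flat (constant) window to avoid division by zero
        norm = (sig - lo) / (hi - lo) if hi > lo else np.zeros_like(sig)
        planes.append(norm.reshape(height, width))
    return np.stack(planes, axis=-1)  # shape: (height, width, 3)

# Example: 1024-sample windows from each sensor
rng = np.random.default_rng(0)
img = signals_to_image(rng.standard_normal(1024),
                       rng.standard_normal(1024),
                       rng.standard_normal(1024))
print(img.shape)  # (32, 32, 3)
```

Because each channel occupies its own plane of the same pixel grid, a standard CNN kernel sees all three modalities at every spatial location, which is what lets cross-modal correlations be encoded in pixel space.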


πŸ“ Abstract
This study introduces a novel method that transforms multimodal physiological signals, namely photoplethysmography (PPG), galvanic skin response (GSR), and acceleration (ACC), into 2D image matrices to enhance stress detection using convolutional neural networks (CNNs). Unlike traditional approaches that process these signals separately or rely on fixed encodings, our technique fuses them into structured image representations that enable CNNs to capture temporal and cross-signal dependencies more effectively. This image-based transformation not only improves interpretability but also serves as a robust form of data augmentation. To further enhance generalization and model robustness, we systematically reorganize the fused signals into multiple formats, combining them in a multi-stage training pipeline. This approach significantly boosts classification performance. While demonstrated here in the context of stress detection, the proposed method is broadly applicable to any domain involving multimodal physiological signals, paving the way for more accurate, personalized, and real-time health monitoring through wearable technologies.
Problem

Research questions and friction points this paper is trying to address.

Fusing multimodal physiological signals (PPG, GSR, ACC) effectively
Weak modeling of temporal and cross-signal dependencies
Poor interpretability and generalization of learned representations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transforms 1D signals into 2D image matrices
Fuses multimodal signals into structured image representations
Uses multi-stage training pipeline for enhanced generalization
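The multi-format reorganization behind the last point can be sketched as below. The three layouts (vertically tiled, side-by-side, and row-interleaved) are illustrative assumptions; the paper states only that the fused signals are systematically reorganized into multiple formats for a multi-stage training pipeline.

```python
import numpy as np

def reorganize_formats(img):
    """Produce alternative 2D layouts of a fused (H, W, 3) signal image,
    loosely following the idea of retiling the same fused data into
    multiple formats for multi-stage training. The specific layouts
    below are illustrative, not the paper's."""
    h, w, _ = img.shape
    # Channels tiled vertically: (3H, W)
    stacked = np.concatenate([img[..., c] for c in range(3)], axis=0)
    # Channels tiled horizontally: (H, 3W)
    side_by_side = np.concatenate([img[..., c] for c in range(3)], axis=1)
    # Rows of the three channels alternated: (3H, W)
    interleaved = np.empty((3 * h, w))
    for c in range(3):
        interleaved[c::3] = img[..., c]
    return {"stacked": stacked,
            "side_by_side": side_by_side,
            "interleaved": interleaved}

formats = reorganize_formats(np.random.default_rng(1).random((32, 32, 3)))
for name, arr in formats.items():
    print(name, arr.shape)
```

Training on several layouts of the same underlying data acts as structured augmentation: the CNN cannot overfit to one fixed spatial arrangement of the modalities.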
πŸ”Ž Similar Papers
No similar papers found.
Y
Yasin Hasanpoor
Department of Mechatronics Engineering, University of Tehran, North Karegar Street, Tehran 13769, Iran
Bahram Tarvirdizadeh
Associate Professor of Mechatronics Eng., Faculty of New Sciences and Tech., University of Tehran
Mechatronics · Robotics · Automation · Medical and rehabilitation mechatronics
Khalil Alipour
Associate Professor, Department of Mechatronics Engineering, University of Tehran
Robotics · Control Theory · Machine/Deep Learning · Modeling and Simulation
Mohammad Ghamari
Department of Electrical Engineering, California Polytechnic State University, San Luis Obispo, CA 93407, USA