Multimodal signal fusion for stress detection using deep neural networks: a novel approach for converting 1D signals to unified 2D images

πŸ“… 2025-09-16
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
To address the challenges of fusing multimodal physiological signals (PPG, GSR, ACC), weak modeling of temporal dependencies, and poor interpretability, this paper proposes a structured 2D image transformation method: each univariate signal channel is converted into a unified RGB-like image matrix via time-frequency mapping and spatial arrangement, explicitly encoding cross-modal correlations in pixel space. Building upon this representation, we design a lightweight CNN architecture augmented with multi-stage progressive training and physiology-aware data augmentation based on image transformations. Experiments on a public stress recognition dataset demonstrate a 6.2% improvement in F1-score over conventional sequential models. The learned image representations exhibit both high discriminability and visual interpretability, enabling real-time, robust, and personalized stress monitoring on wearable devicesβ€”a novel paradigm for edge-deployable physiological analytics.
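The core idea above, converting three 1D signal channels into one RGB-like image matrix, can be sketched minimally as follows. This is an illustrative assumption, not the paper's exact pipeline: each windowed channel is min-max normalized and reshaped into a 2D grid, and the three grids are stacked as color planes (the paper additionally applies a time-frequency mapping, which is omitted here).

```python
import numpy as np

def signals_to_image(ppg, gsr, acc, height=32, width=32):
    """Fuse three 1D physiological signal windows into one RGB-like
    2D image: min-max normalize each channel, reshape it into a
    height x width grid, and stack the grids as R, G, B planes.
    Illustrative sketch only; the paper's time-frequency mapping and
    spatial arrangement are not reproduced here."""
    n = height * width
    planes = []
    for sig in (ppg, gsr, acc):
        sig = np.asarray(sig, dtype=np.float64)[:n]
        lo, hi = sig.min(), sig.max()
        # Guard against a flat (constant) window to avoid division by zero
        norm = (sig - lo) / (hi - lo) if hi > lo else np.zeros_like(sig)
        planes.append(norm.reshape(height, width))
    return np.stack(planes, axis=-1)  # shape: (height, width, 3)

# Example: 1024-sample windows from each sensor
rng = np.random.default_rng(0)
img = signals_to_image(rng.standard_normal(1024),
                       rng.standard_normal(1024),
                       rng.standard_normal(1024))
print(img.shape)  # (32, 32, 3)
```

Because each channel occupies its own plane of the same pixel grid, a standard CNN kernel sees all three modalities at every spatial location, which is what lets cross-modal correlations be encoded in pixel space.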


πŸ“ Abstract
This study introduces a novel method that transforms multimodal physiological signals, namely photoplethysmography (PPG), galvanic skin response (GSR), and acceleration (ACC), into 2D image matrices to enhance stress detection using convolutional neural networks (CNNs). Unlike traditional approaches that process these signals separately or rely on fixed encodings, our technique fuses them into structured image representations that enable CNNs to capture temporal and cross-signal dependencies more effectively. This image-based transformation not only improves interpretability but also serves as a robust form of data augmentation. To further enhance generalization and model robustness, we systematically reorganize the fused signals into multiple formats, combining them in a multi-stage training pipeline. This approach significantly boosts classification performance. While demonstrated here in the context of stress detection, the proposed method is broadly applicable to any domain involving multimodal physiological signals, paving the way for more accurate, personalized, and real-time health monitoring through wearable technologies.
Problem

Research questions and friction points this paper is trying to address.

Fusing multimodal physiological signals (PPG, GSR, ACC) effectively
Weak modeling of temporal and cross-signal dependencies
Poor interpretability and generalization of learned representations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transforms 1D signals into 2D image matrices
Fuses multimodal signals into structured image representations
Uses multi-stage training pipeline for enhanced generalization
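The multi-format reorganization behind the last point can be sketched as below. The three layouts (vertically tiled, side-by-side, and row-interleaved) are illustrative assumptions; the paper states only that the fused signals are systematically reorganized into multiple formats for a multi-stage training pipeline.

```python
import numpy as np

def reorganize_formats(img):
    """Produce alternative 2D layouts of a fused (H, W, 3) signal image,
    loosely following the idea of retiling the same fused data into
    multiple formats for multi-stage training. The specific layouts
    below are illustrative, not the paper's."""
    h, w, _ = img.shape
    # Channels tiled vertically: (3H, W)
    stacked = np.concatenate([img[..., c] for c in range(3)], axis=0)
    # Channels tiled horizontally: (H, 3W)
    side_by_side = np.concatenate([img[..., c] for c in range(3)], axis=1)
    # Rows of the three channels alternated: (3H, W)
    interleaved = np.empty((3 * h, w))
    for c in range(3):
        interleaved[c::3] = img[..., c]
    return {"stacked": stacked,
            "side_by_side": side_by_side,
            "interleaved": interleaved}

formats = reorganize_formats(np.random.default_rng(1).random((32, 32, 3)))
for name, arr in formats.items():
    print(name, arr.shape)
```

Training on several layouts of the same underlying data acts as structured augmentation: the CNN cannot overfit to one fixed spatial arrangement of the modalities.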
πŸ”Ž Similar Papers
No similar papers found.
Y
Yasin Hasanpoor
Department of Mechatronics Engineering, University of Tehran, North Karegar Street, Tehran 13769, Iran
Bahram Tarvirdizadeh
Associate Professor of Mechatronics Eng., Faculty of New Sciences and Tech., University of Tehran
Mechatronics · Robotics · Automation · Medical and rehabilitation mechatronics
Khalil Alipour
Associate Professor, Department of Mechatronics Engineering, University of Tehran
Robotics · Control Theory · Machine/Deep Learning · Modeling and Simulation
Mohammad Ghamari
Department of Electrical Engineering, California Polytechnic State University, San Luis Obispo, CA 93407, USA