Spiking-PhysFormer: Camera-Based Remote Photoplethysmography with Parallel Spike-driven Transformer

📅 2024-02-07
🏛️ arXiv.org
📈 Citations: 7
Influential: 0

🤖 AI Summary
To address the high power consumption of conventional artificial neural networks (ANNs) in remote photoplethysmography (rPPG) on mobile devices, this work introduces spiking neural networks (SNNs) into the rPPG domain, which the authors believe to be a first, and proposes a hybrid architecture named Spiking-PhysFormer. The model combines an ANN-based patch embedding block, SNN-based transformer blocks, and an ANN-based predictor head. Its core contributions are: (i) a parallel spike transformer block that replaces sequential sub-blocks while still aggregating local and global spatio-temporal features; and (ii) a simplified spiking self-attention mechanism that omits the value parameter without compromising performance. Evaluated on four datasets (PURE, UBFC-rPPG, UBFC-Phys, and MMPD), Spiking-PhysFormer reduces power consumption by 12.4% compared to PhysFormer, and the power consumption of the transformer block alone is reduced by a factor of 12.2. Performance remains comparable to PhysFormer and other ANN-based models. The result is a step toward low-power, edge-deployable physiological signal sensing.
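
The two reported figures (a 12.4% overall reduction and a 12.2-fold reduction in the transformer block) can be related with a little arithmetic. The sketch below is a back-of-the-envelope reading, not a result from the paper: it assumes the ANN-based patch embedding and predictor head draw the same power in both models, which the summary does not state. The function name `transformer_power_share` is hypothetical.

```python
def transformer_power_share(total_reduction, block_factor):
    """Estimate the fraction f of baseline power spent in the transformer blocks.

    Assumption (not from the paper): everything outside the transformer
    blocks consumes the same power before and after the change, so
        f * (1 - 1 / block_factor) = total_reduction.
    """
    return total_reduction / (1.0 - 1.0 / block_factor)


# Figures quoted in the summary: 12.4% overall, factor of 12.2 in the block.
share = transformer_power_share(0.124, 12.2)
print(f"Implied transformer share of baseline power: {share:.3f}")
```

Under that assumption the transformer blocks would account for roughly 13.5% of the baseline model's power, which would explain why a large per-block saving translates into a modest overall one.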

📝 Abstract
Artificial neural networks (ANNs) can help camera-based remote photoplethysmography (rPPG) measure cardiac activity and physiological signals, such as pulse wave, heart rate, and respiration rate, from facial videos with better accuracy. However, most existing ANN-based methods require substantial computing resources, which poses challenges for effective deployment on mobile devices. Spiking neural networks (SNNs), on the other hand, hold immense potential for energy-efficient deep learning owing to their binary and event-driven architecture. To the best of our knowledge, we are the first to introduce SNNs into the realm of rPPG, proposing a hybrid neural network (HNN) model, the Spiking-PhysFormer, aimed at reducing power consumption. Specifically, the proposed Spiking-PhysFormer consists of an ANN-based patch embedding block, SNN-based transformer blocks, and an ANN-based predictor head. First, to simplify the transformer block while preserving its capacity to aggregate local and global spatio-temporal features, we design a parallel spike transformer block to replace sequential sub-blocks. Additionally, we propose a simplified spiking self-attention mechanism that omits the value parameter without compromising the model's performance. Experiments conducted on four datasets (PURE, UBFC-rPPG, UBFC-Phys, and MMPD) demonstrate that the proposed model achieves a 12.4% reduction in power consumption compared to PhysFormer. Additionally, the power consumption of the transformer block is reduced by a factor of 12.2, while maintaining performance comparable to PhysFormer and other ANN-based models.
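
To make the idea of a spiking self-attention without a value projection concrete, here is a minimal NumPy sketch. It is one plausible reading of the abstract, not the authors' formulation: the function names, the single-step threshold used in place of a full spiking neuron with membrane dynamics, the reuse of the input spikes where V would normally appear, and the `scale` parameter are all assumptions made for illustration.

```python
import numpy as np


def heaviside_spikes(x, threshold=1.0):
    """Emit a binary spike (1.0) wherever the input reaches the threshold."""
    return (x >= threshold).astype(np.float32)


def simplified_spiking_attention(x_spikes, w_q, w_k, scale):
    """Hypothetical spike-driven attention with no value projection.

    x_spikes: (tokens, dim) binary input spikes, reused in place of V.
    w_q, w_k: (dim, dim) projection weights for queries and keys.
    """
    q = heaviside_spikes(x_spikes @ w_q)  # binary queries
    k = heaviside_spikes(x_spikes @ w_k)  # binary keys
    attn = q @ k.T                        # counts of co-active spikes, no softmax
    out = (attn @ x_spikes) * scale       # aggregate the inputs directly
    return heaviside_spikes(out)          # binarize the output again


# Toy input: 4 tokens, 8 features, about 30% of entries spiking.
rng = np.random.default_rng(0)
x = (rng.random((4, 8)) < 0.3).astype(np.float32)
w_q = rng.normal(size=(8, 8))
w_k = rng.normal(size=(8, 8))
spikes_out = simplified_spiking_attention(x, w_q, w_k, scale=0.25)
print(spikes_out.shape)
```

Because the queries, keys, and inputs are all binary, the two matrix products reduce to accumulations over active spikes, which is the property that event-driven hardware can exploit to save power.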
Problem

Research questions and friction points this paper is trying to address.

Energy-efficient
Heartbeat Measurement
Neural Network
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spiking Neural Networks
Energy Efficiency
Physiological Signal Processing