🤖 AI Summary
Existing wrist-worn photoplethysmography (PPG) devices suffer from low heart-rate estimation accuracy under motion artifacts, while current arrhythmia classification methods are largely limited to unimodal, binary classification (e.g., atrial fibrillation vs. sinus rhythm) and generalize poorly across diverse arrhythmias. To address these limitations, we propose RhythmiNet: an end-to-end, multimodal, three-class classifier (atrial fibrillation / sinus rhythm / other arrhythmias) that jointly processes PPG and accelerometer (ACC) signals. RhythmiNet employs a residual architecture augmented with a novel dual-attention mechanism, operating simultaneously across the temporal and channel dimensions, to improve robustness to motion-induced signal degradation. Crucially, it is the first method to report performance stratified by motion intensity without discarding motion-contaminated segments. Experiments show that RhythmiNet improves macro-AUC by 4.3% over a unimodal PPG baseline and by 12% over a logistic regression model using handcrafted heart rate variability (HRV) features, validating the efficacy of multimodal fusion and attention modeling for real-world arrhythmia classification.
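The dual-attention mechanism described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the channel layout (1 PPG channel + 3 ACC axes), the window length, and the parameter-free sigmoid gates are assumptions here, whereas the actual model learns its attention projections inside residual blocks.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dual_attention(x):
    """Schematic dual attention over a (channels, time) segment.

    Channel attention: a sigmoid gate derived from per-channel global
    average pooling. Temporal attention: a sigmoid gate derived from the
    per-timestep cross-channel mean. Both gates are applied
    multiplicatively, mirroring the simultaneous temporal/channel
    reweighting described in the summary (the exact parametrization is
    an assumption; the real model learns these projections).
    """
    ch_gate = sigmoid(x.mean(axis=1, keepdims=True))  # (C, 1)
    t_gate = sigmoid(x.mean(axis=0, keepdims=True))   # (1, T)
    return x * ch_gate * t_gate

# Fused input: 1 PPG channel + 3 ACC axes, 250 samples (assumed layout).
segment = np.random.randn(4, 250)
out = dual_attention(segment)
```

Because both gates lie in (0, 1), noisy channels or motion-corrupted time steps are attenuated rather than removed, which is the intuition behind attention-based robustness to motion artifacts.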
📝 Abstract
Atrial fibrillation (AF) is a leading cause of stroke and mortality, particularly in elderly patients. Wrist-worn photoplethysmography (PPG) enables non-invasive, continuous rhythm monitoring but is highly vulnerable to motion artifacts and physiological noise. Many existing approaches rely solely on single-channel PPG and are limited to binary AF detection, failing to capture the broader range of arrhythmias encountered in clinical practice. We introduce RhythmiNet, a residual neural network enhanced with temporal and channel attention modules that jointly leverages PPG and accelerometer (ACC) signals. The model performs three-class rhythm classification: AF, sinus rhythm (SR), and Other. To assess robustness under varying movement conditions, test data are stratified by accelerometer-derived motion-intensity percentiles without excluding any segments. RhythmiNet achieved a 4.3% improvement in macro-AUC over a PPG-only baseline and surpassed a logistic regression model built on handcrafted heart rate variability (HRV) features by 12%, highlighting the benefit of multimodal fusion and attention-based learning on noisy, real-world clinical data.
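The motion-stratified evaluation can likewise be sketched. The segment count, window length, and quartile binning below are illustrative assumptions; the key property, as stated in the abstract, is that every test segment is assigned to a motion-intensity stratum by its ACC signal rather than being excluded.

```python
import numpy as np

def motion_strata(acc_segments, n_bins=4):
    """Assign each test segment a motion-intensity stratum.

    Intensity is taken as the mean magnitude of the 3-axis ACC signal per
    segment; strata are percentile bins over the test set, and every
    segment is kept (no exclusion). The bin count and intensity measure
    are assumptions for illustration.
    """
    # (N, 3, T) -> per-timestep magnitude (N, T) -> per-segment mean (N,)
    intensity = np.linalg.norm(acc_segments, axis=1).mean(axis=1)
    # Interior percentile edges, e.g. [25, 50, 75] for quartiles.
    edges = np.percentile(intensity, np.linspace(0, 100, n_bins + 1)[1:-1])
    return np.digitize(intensity, edges)  # stratum index 0 .. n_bins-1

acc = np.random.randn(100, 3, 250)  # 100 segments, 3 ACC axes, 250 samples
strata = motion_strata(acc)
```

Reporting macro-AUC separately per stratum then shows how performance degrades (or holds up) as motion intensity increases, without the selection bias introduced by discarding noisy segments.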