Cross-View Attention Fusion Net: A Prior-Guided Dual-View Representation Learning for Cardiac Output Estimation from Short-Term PPG Signals

📅 2026-05-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Estimating cardiac output (CO) from short photoplethysmography (PPG) segments is difficult because CO depends jointly on cardiac function and vascular tone. Existing feature-based methods depend on accurate pulse detection and may miss latent temporal relationships, while end-to-end deep learning models underuse PPG-derived physiological priors. This work proposes CVAF-Net, which treats a feature sequence map (FSM) built from physiological priors as a separate view and fuses it with the raw PPG time series through a dual-branch architecture with cross-view attention. On simulated data the model reaches an MAE of 0.19 L/min (MAPE: 3.95%); on real-world data its lowest MAE is 1.20 L/min. It outperforms most benchmarks, performs comparably to a state-of-the-art Transformer-based model while requiring twelvefold fewer FLOPs, and yields CO estimates whose correlations with age, heart rate, and systemic vascular resistance are physiologically consistent.
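The two error metrics quoted in this summary can be made concrete with a minimal sketch. It assumes the standard definitions of MAE and MAPE; the function names and the three CO values are illustrative assumptions, not data or code from the paper.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error, in the same unit as the inputs (here L/min)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_pred - y_true)))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent. Assumes no zero reference values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(100.0 * np.mean(np.abs((y_pred - y_true) / y_true)))

# Illustrative values only (L/min); not taken from the paper's datasets.
co_reference = np.array([4.0, 5.0, 6.0])
co_estimate = np.array([4.2, 4.9, 6.3])

print(f"MAE:  {mae(co_reference, co_estimate):.2f} L/min")
print(f"MAPE: {mape(co_reference, co_estimate):.2f} %")
```

MAE is reported in L/min and so depends on the CO range of each dataset, whereas MAPE normalizes each error by its reference value.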

📝 Abstract

Accurate cardiac output (CO) estimation from photoplethysmography (PPG) is promising for unobtrusive hemodynamic monitoring, but remains difficult since CO is jointly determined by cardiac function and vascular tone. Conventional feature-based models use physiologically meaningful PPG descriptors, yet depend on accurate pulse detection and may miss latent temporal relationships. In contrast, fully end-to-end deep learning models learn directly from raw PPG but often underuse established PPG-derived prior information. Here, we introduce the Cross-View Attention Fusion Network (CVAF-Net), a prior-guided dual-view deep learning model for CO estimation from short, fixed-length PPG segments. CVAF-Net processes raw PPG as a temporal view and a feature sequence map (FSM) as a structured prior-guided view, and fuses the two representations through cross-view attention. The model was independently evaluated using 5-, 15-, and 30-s segments from three datasets: simulated pulse waves (3323 subjects), vasoconstriction provocation (79 subjects), and resting/cycling activities (10 subjects), and was compared with multiple machine learning and deep learning benchmarks. CVAF-Net outperformed most benchmark methods and achieved performance comparable to a state-of-the-art Transformer-based model, with a mean absolute error (MAE) of 0.19 L/min (MAPE: 3.95%) on simulated data and high accuracy in real-world settings (minimum MAE: 1.20 L/min). Importantly, CVAF-Net required twelvefold fewer FLOPs than the leading Transformer-based model. Plausibility analysis showed physiologically consistent CO estimates, with expected correlations with age ($\rho = -0.274$), heart rate ($\rho = 0.894$), and systemic vascular resistance ($\rho = -0.740$). These findings indicate that CVAF-Net provides an accurate, computationally efficient, and generalizable approach for continuous wearable-based CO monitoring.
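To illustrate what fusing two views through cross-view attention can look like, here is a minimal NumPy sketch of generic single-head scaled dot-product cross-attention, in which tokens from one view query tokens from the other. It is not the CVAF-Net implementation: the abstract does not specify the layer design, so the token counts, embedding width, random projection matrices, and all names below are assumptions chosen only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_view_attention(query_view, context_view, w_q, w_k, w_v):
    """Let each token of query_view attend over all tokens of context_view."""
    q = query_view @ w_q                      # (n_query, d)
    k = context_view @ w_k                    # (n_context, d)
    v = context_view @ w_v                    # (n_context, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])   # scaled dot-product similarity
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
d_model = 16                                        # assumed embedding width
temporal_tokens = rng.normal(size=(50, d_model))    # hypothetical embedded raw-PPG steps
fsm_tokens = rng.normal(size=(8, d_model))          # hypothetical embedded FSM rows
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) / np.sqrt(d_model)
                 for _ in range(3))

# Temporal view queries the prior-guided view; swapping the first two
# arguments would give the opposite attention direction.
fused, attn = cross_view_attention(temporal_tokens, fsm_tokens, w_q, w_k, w_v)
print(fused.shape, attn.shape)  # (50, 16) (50, 8)
```

In a trained model the projection matrices would be learned and the fused representation would feed a regression head that outputs CO; both are omitted here.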

Problem

Research questions and friction points this paper is trying to address.

cardiac output estimation

photoplethysmography

PPG-derived prior

short-term PPG signals

hemodynamic monitoring

Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-View Attention

Prior-Guided Representation

Dual-View Learning