Compact Latent Manifold Translation: A Parameter-Efficient Foundation Model for Cross-Modal and Cross-Frequency Physiological Signal Synthesis

📅 2026-05-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses three limitations of existing physiological signal foundation models: modality entanglement caused by device heterogeneity, weak cross-frequency generation, and high computational overhead. The authors propose a two-stage discrete translation paradigm. First, a hierarchical residual vector quantization (RVQ) scheme builds a universal tokenizer that separates heterogeneous signals, such as ECG and PPG, into structured discrete latent representations. Second, a context-prompted latent translator that incorporates static physiological priors converts token sequences across modalities. Modeling everything in a discrete latent space is intended to prevent inter-modality interference and to improve fidelity in both cross-modal synthesis and cross-frequency super-resolution, while keeping the model at 0.09B parameters for edge deployment. In the reported experiments, the R-peak detection F1 score for PPG-to-ECG synthesis rises from 0.37 (baseline) to 0.83, and the Pearson correlation reaches 0.9956 for 25 Hz to 100 Hz super-resolution, outperforming much larger baselines.
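The two headline metrics, R-peak detection F1 and Pearson correlation, can be illustrated with a minimal sketch. The paper's exact evaluation protocol is not given in this listing, so the matching rule (greedy one-to-one matching within a tolerance window) and the names `pearson`, `peak_f1`, and `tol` are assumptions made for illustration only.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length waveforms."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (>= 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return cov / math.sqrt(var_x * var_y)


def peak_f1(ref_peaks: Sequence[int], pred_peaks: Sequence[int], tol: int) -> float:
    """F1 score for peak detection.

    Assumption: a predicted peak counts as a true positive if it lies within
    +/- tol samples of a not-yet-matched reference peak (greedy matching over
    sorted indices). The paper may use a different rule or tolerance.
    """
    ref = sorted(ref_peaks)
    pred = sorted(pred_peaks)
    i = j = true_pos = 0
    while i < len(ref) and j < len(pred):
        if abs(ref[i] - pred[j]) <= tol:
            true_pos += 1
            i += 1
            j += 1
        elif pred[j] < ref[i]:
            j += 1  # spurious prediction before the next reference peak
        else:
            i += 1  # reference peak with no matching prediction
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred)
    recall = true_pos / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy R-peak indices (in samples); values are made up for illustration.
    print(peak_f1([100, 200, 300], [102, 198, 350], tol=5))
    print(pearson([0.0, 1.0, 2.0, 3.0], [0.1, 0.9, 2.2, 2.9]))
```

A tolerance window matters here because the abstract attributes part of the baseline's low F1 to temporal phase drift: a synthesized R-peak that is shifted beyond the window counts as both a miss and a false alarm.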
📝 Abstract
The analysis of physiological time series, such as electrocardiograms (ECG) and photoplethysmograms (PPG), is persistently hindered by modality and frequency gaps stemming from heterogeneous recording devices. Existing foundation models typically rely on continuous latent spaces, which frequently suffer from severe modality entanglement, lack high-fidelity cross-frequency generative capacity, and impose high computational costs that prohibit edge-device deployment. In this paper, we propose Compact Latent Manifold Translation (CLMT), a highly parameter-efficient (0.09B) unified framework that bridges these gaps through a novel two-stage discrete translation paradigm. First, we introduce a Universal Tokenizer utilizing Hierarchical Residual Vector Quantization (RVQ) to decouple heterogeneous signals into isolated, well-structured discrete latent manifolds, effectively preventing inter-modality interference. Second, a Context-Prompted Latent Translator maps these discrete tokens across modalities by integrating static physiological priors, reframing complex signal synthesis as a pure latent sequence translation task. Extensive evaluations demonstrate that our 0.09B model significantly outperforms massive baselines. In cross-modal PPG-to-ECG synthesis, it resolves temporal phase drift and dramatically improves the clinical R-peak detection F1-score from 0.37 (baseline) to 0.83. Furthermore, in extreme cross-frequency super-resolution (25Hz to 100Hz), it successfully recovers high-frequency diagnostic landmarks, achieving an unprecedented Pearson correlation of 0.9956. By learning a universal discrete language for biological signals with a fraction of the computational footprint, our approach sets a new trajectory for edge-deployable, multi-modal medical foundation models.
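The residual vector quantization step behind the Universal Tokenizer can be sketched generically as follows. This is a textbook-style RVQ on toy data, not the authors' implementation: the codebooks are hand-written rather than learned, and the names `rvq_encode`, `rvq_decode`, and `nearest` are hypothetical. Each level quantizes whatever the previous levels failed to capture, so a vector becomes a short sequence of discrete token indices.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Codebook = Sequence[Vector]


def nearest(codebook: Codebook, x: Vector) -> int:
    """Index of the codebook entry closest to x (squared Euclidean distance)."""
    def sq_dist(c: Vector) -> float:
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def rvq_encode(x: Vector, codebooks: Sequence[Codebook]) -> Tuple[List[int], List[float]]:
    """Quantize x level by level; return one token per level and the final residual."""
    residual = list(x)
    tokens: List[int] = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        tokens.append(idx)
        # Subtract the chosen code so the next level only sees the leftover error.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tokens, residual


def rvq_decode(tokens: Sequence[int], codebooks: Sequence[Codebook]) -> List[float]:
    """Reconstruct the vector as the sum of the selected codes across levels."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(tokens, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


if __name__ == "__main__":
    # Toy two-level hierarchy: a coarse codebook followed by a fine one.
    coarse = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
    fine = [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25], [-0.25, -0.25]]
    tokens, residual = rvq_encode([1.2, 1.0], [coarse, fine])
    print(tokens, rvq_decode(tokens, [coarse, fine]), residual)
```

In a learned tokenizer the vectors being quantized would be encoder outputs for short signal windows, and the resulting token sequences are what a latent translator would map from one modality to another.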
Problem

Research questions and friction points this paper is trying to address.

cross-modal synthesis
cross-frequency synthesis
physiological signals
modality gap
edge deployment
Innovation

Methods, ideas, or system contributions that make the work stand out.

discrete latent manifold
cross-modal synthesis
parameter-efficient foundation model
hierarchical residual vector quantization
physiological signal translation
Authors
Bo Cui
Eastern Institute of Technology, Ningbo
Nanofabrication, MEMS, electron beam and nanoimprint lithography
Xiaowen Song
Department of Biomedical Signals and Systems, University of Twente
Yaowen Zhang
Department of Biomedical Signals and Systems, University of Twente
Shunzhe Zhang
Department of Applied Mathematics, University of Twente
B. J. F. van Beijnum
Department of Biomedical Signals and Systems, University of Twente
Monique Tabak
University of Twente
eHealth, telemedicine, personalized health, rehabilitation, chronic disease
Ying Wang
Northern Illinois University, Department of Operations Management and Information Systems
Deep Learning, Digital Economy, Corporate Social Responsibility, Diffusion of IT