🤖 AI Summary
To address ECG data scarcity, privacy sensitivity, and high annotation costs, this paper proposes SE-Diff, a text-to-ECG generation framework that integrates physiological modeling and clinical knowledge. It embeds a lightweight ODE-based ECG simulator into a diffusion model to impose physiological consistency constraints, uses large language models (LLMs) to retrieve experience-based clinical knowledge as generation guidance, and designs a multimodal alignment strategy for text-signal semantic mapping. Experiments on multiple real-world ECG datasets show that SE-Diff improves signal fidelity (FID reduced by 32%) and text-ECG alignment (CLIPScore increased by 28%), and that generated samples raise downstream disease classification accuracy by 4.7%. SE-Diff thereby supports mechanistic investigation and privacy-compliant data sharing.
📝 Abstract
Cardiovascular disease (CVD) is a leading cause of mortality worldwide. Electrocardiograms (ECGs) are the most widely used non-invasive tool for cardiac assessment, yet large, well-annotated ECG corpora are scarce due to cost, privacy, and workflow constraints. Generating ECGs can aid the mechanistic understanding of cardiac electrical activity, enable the construction of large, heterogeneous, and unbiased datasets, and facilitate privacy-preserving data sharing. Generating realistic ECG signals from clinical context is important yet underexplored. Recent work has leveraged diffusion models for text-to-ECG generation, but two challenges remain: (i) existing methods often overlook the mechanistic knowledge of cardiac activity encoded in physiological simulators; and (ii) they ignore broader, experience-based clinical knowledge grounded in real-world practice. To address these gaps, we propose SE-Diff, a novel simulator- and experience-enhanced diffusion model for comprehensive ECG generation. SE-Diff integrates a lightweight ordinary differential equation (ODE)-based ECG simulator into the diffusion process via a beat decoder and simulator-consistent constraints, injecting mechanistic priors that promote physiologically plausible waveforms. In parallel, we design an LLM-powered, experience-based retrieval-augmented strategy to inject clinical knowledge, providing additional guidance for ECG generation. Extensive experiments on real-world ECG datasets demonstrate that SE-Diff improves both signal fidelity and text-ECG semantic alignment over baselines, indicating its advantage for text-to-ECG generation. We further show that the simulator-based and experience-based knowledge also benefits downstream ECG classification.
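To make the idea of a lightweight ODE-based ECG simulator concrete, the sketch below integrates the well-known three-variable dynamical model of McSharry et al. (2003). This is an illustrative assumption: the abstract does not state which simulator SE-Diff uses, and the function names, the fixed-step RK4 integrator, and the default wave parameters (the standard values commonly quoted for that model) are choices made here, not details taken from the paper.

```python
import numpy as np

# P, Q, R, S, T wave parameters commonly quoted for the McSharry et al. (2003)
# model. Assumed for illustration; not taken from the SE-Diff paper.
THETA = np.array([-np.pi / 3, -np.pi / 12, 0.0, np.pi / 12, np.pi / 2])  # wave phases
A = np.array([1.2, -5.0, 30.0, -7.5, 0.75])                              # wave amplitudes
B = np.array([0.25, 0.1, 0.1, 0.1, 0.4])                                 # wave widths


def ecg_rhs(state, omega, z0=0.0):
    """Right-hand side of the three-variable ECG ODE system."""
    x, y, z = state
    alpha = 1.0 - np.hypot(x, y)           # pulls (x, y) onto the unit circle
    theta = np.arctan2(y, x)               # cardiac phase
    # Phase distance to each wave, wrapped to [-pi, pi)
    dtheta = (theta - THETA + np.pi) % (2 * np.pi) - np.pi
    dx = alpha * x - omega * y
    dy = alpha * y + omega * x
    dz = -np.sum(A * dtheta * np.exp(-dtheta**2 / (2 * B**2))) - (z - z0)
    return np.array([dx, dy, dz])


def simulate_ecg(duration_s=2.0, fs=500, heart_rate_bpm=60.0):
    """Integrate the model with fixed-step RK4; returns (t, trajectory[n, 3])."""
    omega = 2 * np.pi * heart_rate_bpm / 60.0  # angular velocity, one turn per beat
    dt = 1.0 / fs
    n = int(duration_s * fs)
    state = np.array([-1.0, 0.0, 0.0])         # start on the unit circle at phase pi
    traj = np.empty((n, 3))
    for i in range(n):
        traj[i] = state
        k1 = ecg_rhs(state, omega)
        k2 = ecg_rhs(state + 0.5 * dt * k1, omega)
        k3 = ecg_rhs(state + 0.5 * dt * k2, omega)
        k4 = ecg_rhs(state + dt * k3, omega)
        state = state + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return np.arange(n) * dt, traj


if __name__ == "__main__":
    t, traj = simulate_ecg()
    ecg = traj[:, 2]  # the z component is the (unscaled) synthetic ECG
    print(f"{len(ecg)} samples, peak amplitude {ecg.max():.4f}")
```

In a model of this kind, each heartbeat is one revolution around the unit circle in the (x, y) plane, and the P, Q, R, S, and T waves appear as Gaussian-shaped deflections in z at fixed phases. A handful of interpretable parameters therefore controls the morphology, which is one plausible reason such a simulator could act as a mechanistic prior for a generative model.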