🤖 AI Summary
Current unified multimodal models (e.g., GPT-5) lack electrocardiogram (ECG) signal understanding and generation capabilities. To address this, we propose UniECG, the first unified model supporting evidence-driven ECG interpretation and text-conditioned ECG generation. Our method employs a decoupled two-stage training paradigm: Stage I trains an ECG-to-text model for evidence-based diagnostic reasoning; Stage II introduces a latent-space alignment mechanism that maps textual semantics into ECG latent representations, enabling bidirectional cross-modal modeling. This design unifies ECG understanding and generation within a single architecture for the first time, substantially expanding the frontier of intelligent ECG analysis. Experiments demonstrate that UniECG achieves state-of-the-art performance in both diagnostic accuracy and generation fidelity, while supporting instruction-driven, autonomous task switching. The code and model will be publicly released.
📝 Abstract
Recent unified models such as GPT-5 have achieved encouraging progress on vision-language tasks. However, these unified models typically fail to understand ECG signals or provide accurate medical diagnoses, and they cannot correctly generate ECG signals either. To address these limitations, we propose UniECG, the first unified ECG model capable of concurrently performing evidence-based ECG interpretation and text-conditioned ECG generation. Through a decoupled two-stage training approach, the model first learns evidence-based interpretation skills (ECG-to-Text), and then acquires ECG generation capabilities (Text-to-ECG) via latent space alignment. UniECG autonomously chooses whether to interpret or generate an ECG based on the user's input, significantly extending the capability boundaries of current ECG models. Our code and checkpoints will be made publicly available at https://github.com/PKUDigitalHealth/UniECG upon acceptance.
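To give a rough intuition for the Stage-II latent-space alignment described above, the sketch below learns a map that projects text embeddings into an ECG latent space so that a frozen ECG decoder could synthesize signals from text. This is a minimal, hypothetical illustration with synthetic data and a closed-form least-squares fit standing in for gradient-based training; all names, dimensions, and the linear-map assumption are ours, not the actual UniECG implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumed): text-embedding size, ECG-latent size, #pairs.
n_pairs, d_text, d_ecg = 256, 64, 32

# Synthetic paired data: frozen text-encoder outputs and the ECG
# latents of the matching recordings (here generated by a hidden
# ground-truth linear relation so alignment is recoverable).
W_true = rng.normal(size=(d_text, d_ecg))
text_emb = rng.normal(size=(n_pairs, d_text))
ecg_latent = text_emb @ W_true

# "Alignment": fit a projection W that maps text embeddings into the
# ECG latent space (least squares in place of SGD on an MSE loss).
W, *_ = np.linalg.lstsq(text_emb, ecg_latent, rcond=None)

aligned = text_emb @ W
mse = float(np.mean((aligned - ecg_latent) ** 2))
print(f"alignment MSE: {mse:.2e}")  # near zero on this linear toy data
```

In a real model the projection would be nonlinear and trained jointly with (or against) a pretrained ECG autoencoder, but the objective is the same: text semantics land where the ECG decoder expects latents to be.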