Label-free Motion-Conditioned Diffusion Model for Cardiac Ultrasound Synthesis

📅 2025-12-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
The scarcity of expert annotations in echocardiography severely hinders the development and deployment of deep learning models. Method: This paper proposes an unsupervised Motion-Conditioned Diffusion Model (MCDM), a framework for temporally consistent echocardiographic video synthesis without manual labels. It introduces a Motion and Appearance Feature Extractor (MAFE) that disentangles dynamic and static representations via self-supervised learning; adds a pseudo optical-flow constraint and a pseudo-appearance re-identification loss to strengthen motion modeling; and leverages a latent diffusion architecture for high-fidelity generation. Results: Evaluated on EchoNet-Dynamic, MCDM generates clinically realistic, frame-coherent echocardiographic videos whose quality rivals that of supervised methods, while eliminating reliance on expert annotations. This work establishes a scalable, annotation-free paradigm for medical video generation.
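To make the disentanglement idea concrete, here is a minimal sketch of a motion/appearance extractor along the lines MAFE is described: a shared per-frame backbone feeding a static appearance head and a temporal motion head. The architecture, feature dimensions, and input shape (grayscale 112x112 clips, as in EchoNet-Dynamic) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class MAFESketch(nn.Module):
    """Hypothetical motion/appearance disentangler: a shared 2D backbone
    runs frame-by-frame; an appearance head pools over time (static
    content) while a GRU head models frame-to-frame dynamics (motion)."""
    def __init__(self, dim=128):
        super().__init__()
        self.backbone = nn.Sequential(            # per-frame encoder
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.appearance_head = nn.Linear(64, dim)             # static
        self.motion_head = nn.GRU(64, dim, batch_first=True)  # dynamic

    def forward(self, video):                     # (B, T, 1, H, W)
        b, t = video.shape[:2]
        feats = self.backbone(video.flatten(0, 1)).view(b, t, -1)
        appearance = self.appearance_head(feats.mean(dim=1))  # (B, dim)
        motion, _ = self.motion_head(feats)                   # (B, T, dim)
        return motion, appearance

# Example: a batch of two 16-frame grayscale clips.
motion, appearance = MAFESketch()(torch.randn(2, 16, 1, 112, 112))
```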

📝 Abstract
Ultrasound echocardiography is essential for the non-invasive, real-time assessment of cardiac function, but the scarcity of labelled data, driven by privacy restrictions and the complexity of expert annotation, remains a major obstacle for deep learning methods. We propose the Motion Conditioned Diffusion Model (MCDM), a label-free latent diffusion framework that synthesises realistic echocardiography videos conditioned on self-supervised motion features. To extract these features, we design the Motion and Appearance Feature Extractor (MAFE), which disentangles motion and appearance representations from videos. Feature learning is further enhanced by two auxiliary objectives: a re-identification loss guided by pseudo appearance features and an optical flow loss guided by pseudo flow fields. Evaluated on the EchoNet-Dynamic dataset, MCDM achieves competitive video generation performance, producing temporally coherent and clinically realistic sequences without reliance on manual labels. These results demonstrate the potential of self-supervised conditioning for scalable echocardiography synthesis. Our code is available at https://github.com/ZheLi2020/LabelfreeMCDM.
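The two auxiliary objectives in the abstract lend themselves to a short, hedged sketch: an L1 regression of predicted motion onto a pseudo optical-flow field (e.g. from an off-the-shelf flow estimator), and a contrastive re-identification loss that pulls together appearance features extracted from two clips of the same video. The pseudo-target construction and positive-pair scheme below are assumptions; the paper's exact formulation may differ.

```python
import torch
import torch.nn.functional as F

def pseudo_flow_loss(pred_flow, pseudo_flow):
    """Regress predicted motion toward a pseudo flow field, e.g. one
    produced by a pretrained optical-flow estimator on unlabeled clips."""
    return F.l1_loss(pred_flow, pseudo_flow)

def pseudo_reid_loss(appearance_a, appearance_b, temperature=0.1):
    """Contrastive re-identification: appearance features from two clips
    of the same video should match (positives sit on the diagonal)."""
    a = F.normalize(appearance_a, dim=-1)          # (B, dim)
    b = F.normalize(appearance_b, dim=-1)          # (B, dim)
    logits = a @ b.t() / temperature               # (B, B) similarities
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)
```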
Problem

Research questions and friction points this paper is trying to address.

Synthesizes cardiac ultrasound videos without manual labels
Extracts motion and appearance features from echocardiography data
Enhances video generation using self-supervised auxiliary objectives
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-supervised motion features condition the diffusion model (see the sketch after this list)
Motion and Appearance Feature Extractor (MAFE) disentangles motion from appearance
Auxiliary losses strengthen feature learning without manual labels
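Below is a minimal sketch of how self-supervised motion features might condition latent diffusion training, assuming a standard DDPM-style noise-prediction objective. The denoiser interface, the cond keyword, and the noise schedule are stand-ins, not the paper's architecture.

```python
import torch
import torch.nn.functional as F

def motion_conditioned_step(denoiser, latents, motion_feats, alphas_cumprod):
    """One hypothetical training step: noise the video latents at a random
    timestep, then predict the noise given the timestep and the
    self-supervised motion features as conditioning."""
    b = latents.size(0)
    t = torch.randint(0, alphas_cumprod.size(0), (b,), device=latents.device)
    noise = torch.randn_like(latents)
    a = alphas_cumprod[t].view(b, *([1] * (latents.dim() - 1)))
    noisy = a.sqrt() * latents + (1 - a).sqrt() * noise  # forward process
    pred = denoiser(noisy, t, cond=motion_feats)  # assumed conditioning API
    return F.mse_loss(pred, noise)
```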