AI Summary
This study addresses the dual challenges of scarce cardiac ultrasound data and stringent patient privacy requirements by introducing the first generative foundation model specifically designed for cardiac ultrasound. Methodologically, we propose a privacy-aware latent-space re-identification mechanism and a novel ejection fraction (EF)-conditioned video flow matching paradigm, integrating an adversarial variational autoencoder with conditional flow matching to generate anatomically consistent and clinically interpretable high-fidelity images and videos. Key contributions include: (1) the first empirical demonstration that purely synthetic ultrasound data achieves performance on EF regression comparable to real data (MAE ≤ 4.2%); and (2) effective mitigation of raw-image leakage risks via our privacy-preserving architecture. The model and associated benchmark dataset are publicly released, establishing a new privacy-compliant paradigm for medical AI development.
Abstract
Advances in deep learning have significantly enhanced medical image analysis, yet the availability of large-scale medical datasets remains constrained by patient privacy concerns. We present EchoFlow, a novel framework designed to generate high-quality, privacy-preserving synthetic echocardiogram images and videos. EchoFlow comprises four key components: an adversarial variational autoencoder for defining an efficient latent representation of cardiac ultrasound images, a latent image flow matching model for generating accurate latent echocardiogram images, a latent re-identification model to ensure privacy by filtering images anatomically, and a latent video flow matching model for animating latent images into realistic echocardiogram videos conditioned on ejection fraction. We rigorously evaluate our synthetic datasets on the clinically relevant task of ejection fraction regression and demonstrate, for the first time, that downstream models trained exclusively on EchoFlow-generated synthetic datasets achieve performance parity with models trained on real datasets. We release our models and synthetic datasets, enabling broader, privacy-compliant research in medical ultrasound imaging at https://huggingface.co/spaces/HReynaud/EchoFlow.
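The privacy-filtering step described above can be illustrated with a minimal sketch. It is not EchoFlow's implementation: the abstract describes a learned latent re-identification model, whereas this sketch substitutes plain cosine similarity between embedding vectors as a stand-in. The function names, the embedding vectors, and the 0.9 threshold are all hypothetical and chosen only for illustration. The idea shown is that a generated sample is discarded when it looks too similar to any training sample.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def filter_private(
    generated: List[Sequence[float]],
    training: List[Sequence[float]],
    threshold: float = 0.9,  # hypothetical cut-off, not taken from the paper
) -> List[Sequence[float]]:
    """Keep generated embeddings whose highest similarity to any
    training embedding stays below the threshold."""
    kept = []
    for g in generated:
        max_sim = max((cosine_similarity(g, t) for t in training), default=0.0)
        if max_sim < threshold:
            kept.append(g)
    return kept


# Toy usage: the first sample nearly duplicates the training sample and is
# rejected; the second is orthogonal to it and is kept.
training_embeddings = [[1.0, 0.0]]
generated_embeddings = [[1.0, 0.01], [0.0, 1.0]]
safe = filter_private(generated_embeddings, training_embeddings)
```

In a real pipeline the embeddings would presumably come from the re-identification model operating in the autoencoder's latent space, and the threshold would need to be calibrated on held-out real data rather than fixed by hand.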