🤖 AI Summary
Cardiac ultrasound diagnosis suffers from operator dependence, geographic disparities in resource availability, and human-induced variability—necessitating reproducible, scalable automation. This paper proposes the first end-to-end autonomous scanning framework integrating generative AI with deep reinforcement learning (DRL). Specifically, we co-train a conditional generative adversarial network (cGAN) coupled with a variational autoencoder (VAE) to synthesize high-fidelity simulated ultrasound images, jointly optimized with a DRL policy for closed-loop, real-time scanning path planning. An integrated image quality assessment and classification module ensures output consistency and diagnostic relevance. We further release the first publicly available, annotated real-world cardiac ultrasound dataset. Experiments demonstrate robust generation of high-quality scanning trajectories across diverse configurations, substantially reducing reliance on expert knowledge. The framework exhibits strong cross-organ generalizability, reproducibility, and clinical deployability.
📝 Abstract
Cardiac ultrasound (US) is among the most widely used diagnostic tools in cardiology for assessing heart health, but its effectiveness is limited by operator dependence, time constraints, and human error. The shortage of trained professionals, especially in remote areas, further restricts access. These issues underscore the need for automated solutions that ensure consistent and accessible cardiac imaging regardless of operator skill or location. Recent progress in artificial intelligence (AI), particularly in deep reinforcement learning (DRL), has drawn attention for enabling autonomous decision-making. However, existing DRL-based approaches to cardiac US scanning lack reproducibility, rely on proprietary data, and use simplified models. Motivated by these gaps, we present the first end-to-end framework that integrates generative AI and DRL to enable autonomous, reproducible cardiac US scanning. The framework comprises two components: (i) a conditional generative simulator that combines Generative Adversarial Networks (GANs) with Variational Autoencoders (VAEs) to model the cardiac US environment and produce realistic, action-conditioned images; and (ii) a DRL module that leverages this simulator to learn accurate, autonomous scanning policies. The framework delivers AI-driven guidance through expert-validated models that classify image type and assess image quality, supports conditional generation of realistic US images, and establishes a reproducible foundation extendable to other organs. To support reproducibility, a publicly available dataset of real cardiac US scans is released. The solution is validated through several experiments: the VAE-GAN is benchmarked against existing GAN variants using both qualitative and quantitative measures, and the DRL-based scanning system is evaluated under varying configurations to demonstrate its effectiveness.
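To make the closed-loop idea concrete, the sketch below pairs a stub simulator (standing in for the action-conditioned VAE-GAN plus quality assessor) with a tabular Q-learning agent that learns a scanning policy against it. The 1-D probe model, the `simulator_quality` reward, and all names are illustrative assumptions for exposition only, not the paper's actual implementation or architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

N_POS, TARGET = 11, 7          # discrete probe positions; TARGET = optimal view
ACTIONS = (-1, 0, +1)          # move probe left, stay, move right

def simulator_quality(pos):
    """Stub for the generative simulator + quality-assessment module:
    image quality peaks at the target view and decays with distance."""
    return np.exp(-0.5 * (pos - TARGET) ** 2)

def step(pos, a):
    """One closed-loop interaction: apply a probe action, receive the
    simulated image's quality score as reward."""
    new_pos = int(np.clip(pos + ACTIONS[a], 0, N_POS - 1))
    reward = simulator_quality(new_pos)
    return new_pos, reward, new_pos == TARGET

# Tabular Q-learning over (probe position, action)
Q = np.zeros((N_POS, len(ACTIONS)))
alpha, gamma, eps = 0.5, 0.9, 0.2

for _ in range(2000):
    pos = int(rng.integers(N_POS))
    for _ in range(30):
        a = int(rng.integers(len(ACTIONS))) if rng.random() < eps else int(Q[pos].argmax())
        new_pos, r, done = step(pos, a)
        Q[pos, a] += alpha * (r + gamma * Q[new_pos].max() - Q[pos, a])
        pos = new_pos
        if done:
            break

# Greedy rollout: the learned policy steers the probe toward the target view.
pos, path = 0, [0]
for _ in range(N_POS):
    pos, _, done = step(pos, int(Q[pos].argmax()))
    path.append(pos)
    if done:
        break
print(path)
```

In the actual framework the tabular state is replaced by generated US images and the stub reward by the learned quality and view-classification models, but the interaction loop (policy acts, simulator renders, assessor scores) has the same shape.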