🤖 AI Summary
Existing automated methods for fetal ultrasound analysis struggle to balance task-specific accuracy with the versatility needed to support the full clinical workflow, and they lack unified support for both images and video. This work proposes FetalAgents, the first multi-agent system tailored to fetal ultrasound. A lightweight coordination framework dynamically orchestrates specialized vision experts for diagnosis, biometric measurement, segmentation, and video keyframe summarization, and the system integrates patient metadata to generate structured clinical reports. In a multi-center external evaluation across eight clinical tasks, the approach delivers more robust and accurate performance than both task-specific models and multimodal large language models, while providing an auditable, workflow-aligned solution for analysis and reporting.
📝 Abstract
Fetal ultrasound (US) is the primary imaging modality for prenatal screening, yet its interpretation relies heavily on the expertise of the clinician. Despite advances in deep learning and foundation models, existing automated tools for fetal US analysis struggle to balance task-specific accuracy with the whole-process versatility required to support end-to-end clinical workflows. To address these limitations, we propose FetalAgents, the first multi-agent system for comprehensive fetal US analysis. Through a lightweight, agentic coordination framework, FetalAgents dynamically orchestrates specialized vision experts to maximize performance across diagnosis, measurement, and segmentation. Furthermore, FetalAgents advances beyond static image analysis by supporting end-to-end video stream summarization, where keyframes are automatically identified across multiple anatomical planes, analyzed by coordinated experts, and synthesized with patient metadata into a structured clinical report. Extensive multi-center external evaluations across eight clinical tasks demonstrate that FetalAgents consistently delivers the most robust and accurate performance when compared against specialized models and multimodal large language models (MLLMs), ultimately providing an auditable, workflow-aligned solution for fetal ultrasound analysis and reporting.
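The video workflow described above (pick keyframes per anatomical plane, route each to the relevant experts, merge the results with patient metadata) can be illustrated with a minimal sketch. This is not the FetalAgents implementation: the `Frame` and `Coordinator` types, the per-frame `quality` score, the plane labels, and the metadata field are all hypothetical stand-ins chosen for illustration, and the experts are plain callables rather than vision models.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# Hypothetical frame record; a real system would derive the plane label and
# quality score from a vision model rather than receive them as inputs.
@dataclass
class Frame:
    index: int
    plane: str      # illustrative plane label, e.g. "head" or "abdomen"
    quality: float  # stand-in keyframe score, higher is better


# An expert is any callable that maps a keyframe to a dict of findings.
Expert = Callable[[Frame], Dict[str, object]]


@dataclass
class Coordinator:
    """Toy coordinator: routes keyframes to the experts registered per plane."""
    experts: Dict[str, List[Expert]] = field(default_factory=dict)

    def register(self, plane: str, expert: Expert) -> None:
        self.experts.setdefault(plane, []).append(expert)

    def select_keyframes(self, frames: List[Frame]) -> Dict[str, Frame]:
        # Keep the highest-scoring frame seen for each anatomical plane.
        best: Dict[str, Frame] = {}
        for frame in frames:
            if frame.plane not in best or frame.quality > best[frame.plane].quality:
                best[frame.plane] = frame
        return best

    def report(self, frames: List[Frame], metadata: Dict[str, object]) -> Dict[str, object]:
        findings: Dict[str, Dict[str, object]] = {}
        for plane, keyframe in self.select_keyframes(frames).items():
            # Record which frame was used so the output can be traced back.
            merged: Dict[str, object] = {"keyframe": keyframe.index}
            for expert in self.experts.get(plane, []):
                merged.update(expert(keyframe))
            findings[plane] = merged
        return {"patient": dict(metadata), "findings": findings}


# Usage with a placeholder expert and a hypothetical metadata field.
coordinator = Coordinator()
coordinator.register("head", lambda frame: {"expert": "head_biometry"})
video = [Frame(0, "head", 0.2), Frame(1, "head", 0.9), Frame(2, "abdomen", 0.5)]
structured = coordinator.report(video, {"gestational_age_weeks": 24})
```

Storing the keyframe index next to each plane's findings is one simple way to keep a report traceable to its source frames, in the spirit of the auditability the abstract emphasizes.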