🤖 AI Summary
Existing ultrasound analysis methods struggle to generalize across organs, imaging views, and devices, and they lack interpretable, workflow-level outputs. To address this, the work proposes USTri, described as the first end-to-end intelligent ultrasound agent system, which is trained in three stages. First, a universal generalist model (USGen) is trained across domains. Second, dataset-specific heads are fine-tuned with the USGen backbone frozen, yielding the USpec specialists. Third, a coordinating agent (USAgent) orchestrates the USpec specialists for multi-step inference that mimics clinician workflows and generates structured reports. USTri unifies modeling across multiple organs and tasks, mitigating cross-task interference while preserving shared ultrasound knowledge. On the FMC_UIA validation set, which spans 4 task types and 27 datasets, USTri achieves the best overall performance and outperforms state-of-the-art methods, and qualitative results show accurate, clinically interpretable reports.
📝 Abstract
Clinical ultrasound analysis demands models that generalize across heterogeneous organs, views, and devices while supporting interpretable, workflow-level analysis. Existing methods often rely on task-wise adaptation, and joint learning can be unstable because of cross-task interference, which makes workflow-level outputs hard to deliver in practice. To address these challenges, we present USTri, a tri-stage ultrasound intelligence pipeline for unified multi-organ, multi-task analysis. Stage I trains a universal generalist, USGen, on different domains to learn broad, transferable priors that are robust to device and protocol variability. To better handle domain shifts and reach task-aligned performance while preserving shared ultrasound knowledge, Stage II builds USpec by keeping USGen frozen and fine-tuning dataset-specific heads. Stage III introduces USAgent, which mimics clinician workflows by orchestrating USpec specialists for multi-step inference and deterministic structured reports. On the FMC_UIA validation set, our model achieves the best overall performance across 4 task types and 27 datasets, outperforming state-of-the-art methods. Moreover, qualitative results show that USAgent produces clinically structured reports with high accuracy and interpretability. Our study suggests a scalable path to ultrasound intelligence that generalizes across heterogeneous ultrasound tasks and supports consistent end-to-end clinical workflows.
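The three stages can be pictured with a minimal, dependency-free Python sketch. It is an illustration only and not the authors' implementation: `shared_features` is a toy stand-in for the frozen USGen backbone, the `Specialist` class stands in for a USpec head on top of that backbone, and `run_agent` stands in for USAgent. The task names (`classification`, `measurement`), the thresholds, and the report layout are all hypothetical, since the abstract does not specify them.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List


def shared_features(image: List[float]) -> List[float]:
    """Toy stand-in for the frozen Stage I generalist (assumption, not USGen itself).

    Every specialist calls this same function, so the shared representation
    is never modified by any individual head.
    """
    mean = sum(image) / len(image)
    spread = max(image) - min(image)
    return [mean, spread]


@dataclass(frozen=True)
class Specialist:
    """Toy Stage II specialist: the shared frozen backbone plus its own head."""
    task: str
    head: Callable[[List[float]], object]

    def predict(self, image: List[float]) -> object:
        # Only the head differs between specialists; the features are shared.
        return self.head(shared_features(image))


def build_specialists() -> Dict[str, Specialist]:
    """Registry of hypothetical specialists, keyed by a placeholder task name."""
    return {
        "classification": Specialist(
            "classification",
            lambda f: "abnormal" if f[0] > 0.5 else "normal",
        ),
        "measurement": Specialist(
            "measurement",
            lambda f: round(f[1] * 10.0, 2),
        ),
    }


def run_agent(image: List[float], plan: List[str],
              specialists: Dict[str, Specialist]) -> str:
    """Toy Stage III agent: run the planned steps in order, then emit a report.

    The report is serialized with sorted keys, so the same image and plan
    always yield the same string.
    """
    findings = {}
    for task in plan:
        findings[task] = specialists[task].predict(image)
    report = {"steps": list(plan), "findings": findings}
    return json.dumps(report, sort_keys=True, indent=2)


if __name__ == "__main__":
    toy_image = [0.2, 0.9, 0.7]  # placeholder pixel intensities
    print(run_agent(toy_image, ["classification", "measurement"],
                    build_specialists()))
```

Keeping the backbone as a single shared function and confining all task-specific behavior to the heads mirrors the stated goal of Stage II: heads can be added or retrained per dataset without disturbing the shared representation. The agent contains no learned component in this sketch; it follows a fixed plan, which is one simple way to obtain a deterministic report.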