Auto-US: An Ultrasound Video Diagnosis Agent Using Video Classification Framework and LLMs

📅 2025-11-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing AI-assisted ultrasound video diagnosis research suffers from insufficient data diversity, limited model performance, and poor clinical interpretability. To address these challenges, this work proposes a multimodal intelligent diagnostic framework. First, we introduce CUV—the first multi-source, heterogeneous ultrasound video dataset targeting three organs (thyroid, breast, liver), comprising 495 videos covering five lesion categories. Second, we design CTU-Net, a lightweight spatiotemporal fusion network that efficiently models video features, achieving an 86.73% classification accuracy. Third, we integrate a large language model (LLM) to align visual representations with clinical text, generating interpretable, guideline-compliant diagnostic reports; physician evaluations yield a mean score of 3.2/5. This is the first study to enable end-to-end ultrasound video–text joint reasoning, significantly enhancing diagnostic efficiency, accuracy, and clinical deployability.

📝 Abstract

AI-assisted ultrasound video diagnosis presents new opportunities to enhance the efficiency and accuracy of medical imaging analysis. However, existing research remains limited in terms of dataset diversity, diagnostic performance, and clinical applicability. In this study, we propose Auto-US, an intelligent diagnosis agent that integrates ultrasound video data with clinical diagnostic text. To support this, we constructed the CUV Dataset of 495 ultrasound videos spanning five categories and three organs, aggregated from multiple open-access sources. We developed CTU-Net, which achieves state-of-the-art performance in ultrasound video classification, reaching an accuracy of 86.73%. Furthermore, by incorporating large language models, Auto-US is capable of generating clinically meaningful diagnostic suggestions. The final diagnostic scores for each case exceeded 3 out of 5 and were validated by professional clinicians. These results demonstrate the effectiveness and clinical potential of Auto-US in real-world ultrasound applications. Code and data are available at: https://github.com/Bean-Young/Auto-US.

Problem

Research questions and friction points this paper is trying to address.

Develops AI agent for ultrasound video diagnosis

Addresses limitations in dataset diversity and diagnostic performance

Integrates video classification with clinical text analysis

Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates ultrasound video data with clinical diagnostic text

Develops CTU-Net for ultrasound video classification

Incorporates large language models for diagnostic suggestions
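
The hand-off from video classification to LLM-based suggestions can be pictured with a minimal sketch. Everything here is an illustrative assumption and not the authors' implementation: the placeholder label names, the mean-over-frames aggregation, and the helper names `classify_video` and `build_prompt` are hypothetical, and the classifier is a stand-in, not CTU-Net. The sketch only shows how a video-level prediction might be turned into text that a language model could take as input.

```python
from math import exp

# Hypothetical label set: the source reports five categories across three
# organs but does not name them, so these placeholders are assumptions.
LABELS = ["category_0", "category_1", "category_2", "category_3", "category_4"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_video(frame_logits):
    """Average per-frame probabilities into one video-level prediction.

    Stand-in for a video classifier; this is not the CTU-Net architecture.
    """
    per_frame = [softmax(row) for row in frame_logits]
    n = len(per_frame)
    mean_probs = [sum(p[i] for p in per_frame) / n for i in range(len(LABELS))]
    best = max(range(len(LABELS)), key=lambda i: mean_probs[i])
    return LABELS[best], mean_probs[best]


def build_prompt(organ, label, confidence):
    """Format the classifier output as text an LLM could take as input."""
    return (
        f"Ultrasound video of the {organ}. "
        f"Predicted category: {label} (confidence {confidence:.2f}). "
        "Draft a diagnostic suggestion for clinician review."
    )


# Made-up per-frame scores for a three-frame clip (illustrative only).
frame_logits = [
    [0.1, 2.0, 0.3, 0.0, -1.0],
    [0.0, 1.5, 0.2, 0.1, -0.5],
    [0.2, 2.2, 0.1, 0.0, -0.8],
]
label, confidence = classify_video(frame_logits)
prompt = build_prompt("thyroid", label, confidence)
print(prompt)
```

Passing the predicted category and its confidence as plain text keeps the classifier and the language model loosely coupled, so either component could be swapped without changing the other.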
