From Scanning Guidelines to Action: A Robotic Ultrasound Agent with LLM-Based Reasoning

📅 2026-03-15
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work proposes the first large language model (LLM)-based agent framework for robotic ultrasound, addressing the limitations of conventional systems that rely on fixed protocols and task-specific models and thus struggle to meet diverse clinical scanning demands. By interpreting clinical ultrasound guidelines, dynamically invoking appropriate tools, and optimizing decision-making through reinforcement learning, the framework enables interpretable and generalizable autonomous scanning. The approach demonstrates high accuracy in reasoning and tool invocation across ten distinct ultrasound guidelines and exhibits practical cross-task feasibility and generalization capability in robotic scans of the gallbladder, spine, and kidneys.

Technology Category

Application Category

📝 Abstract
Robotic ultrasound offers advantages over free-hand scanning, including improved reproducibility and reduced operator dependency. In clinical practice, ultrasound (US) acquisition relies heavily on the sonographer's experience and situational judgment. When transferring this process to robotic systems, such expertise is often encoded explicitly through fixed procedures and task-specific models, yielding pipelines that can be difficult to adapt to new scanning tasks. In this work, we propose a unified framework for autonomous robotic US scanning that leverages an LLM-based agent to interpret US scanning guidelines and execute scans by dynamically invoking a set of provided software tools. Instead of encoding fixed scanning procedures, the LLM agent retrieves and reasons over guideline steps from scanning handbooks and adapts its planning decisions based on observations and the current scanning state. This enables the system to handle variable and decision-dependent workflows, such as adjusting scanning strategies, repeating steps, or selecting the appropriate next tool call in response to image quality or anatomical findings. Because the reasoning underlying tool selection is also critical for transparent and trustworthy planning, we further fine-tune the LLM agent using an RL-based strategy to improve both its reasoning quality and the correctness of tool selection and parameterization, while maintaining robust generalization to unseen guidelines and related tasks. We first validate the approach via verbal execution on 10 US scanning guidelines, assessing reasoning as well as tool selection and parameterization, and showing the benefit of RL fine-tuning. We then demonstrate real-world feasibility on robotic scanning of the gallbladder, spine, and kidney. Overall, the framework follows diverse guidelines and enables reliable autonomous scanning across multiple anatomical targets within a unified system.
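The abstract describes an agent loop that reads guideline steps, re-plans after each observation, and repeats or skips tool calls depending on the scanning state. A minimal sketch of that control flow is below; all names (`ScanState`, `TOOLS`, `plan_next_call`, the guideline text) are illustrative placeholders, not the paper's actual API, and a rule-based stub stands in for the LLM planner.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the guideline-driven tool-invocation loop described
# in the abstract. The planner stub below stands in for the LLM agent.

@dataclass
class ScanState:
    positioned: bool = False
    image_quality: float = 0.0
    log: list = field(default_factory=list)

def move_probe(state: ScanState, target: str) -> None:
    state.positioned = True
    state.image_quality = 0.5   # coarse placement rarely yields a clean view
    state.log.append(f"move_probe:{target}")

def optimize_contact(state: ScanState) -> None:
    state.image_quality = min(1.0, state.image_quality + 0.4)
    state.log.append("optimize_contact")

def capture_image(state: ScanState) -> None:
    state.log.append("capture_image")

TOOLS = {"move_probe": move_probe,
         "optimize_contact": optimize_contact,
         "capture_image": capture_image}

GUIDELINE = ["Position the probe over the gallbladder",
             "Verify image quality before capture",
             "Acquire the standard view"]

def plan_next_call(state: ScanState, instruction: str):
    """Stand-in for the LLM planner: map a guideline step plus the current
    scan state to a (tool, kwargs) decision, or (None, {}) when done."""
    if "Position" in instruction and not state.positioned:
        return "move_probe", {"target": "gallbladder"}
    if "Verify" in instruction and state.image_quality < 0.8:
        return "optimize_contact", {}
    if "Acquire" in instruction and "capture_image" not in state.log:
        return "capture_image", {}
    return None, {}

def run_scan() -> ScanState:
    state = ScanState()
    for instruction in GUIDELINE:
        # Decision-dependent workflow: re-plan after every tool call, so a
        # step can repeat (e.g. re-optimizing contact) until its goal holds.
        for _ in range(5):
            tool, kwargs = plan_next_call(state, instruction)
            if tool is None:
                break
            TOOLS[tool](state, **kwargs)
    return state
```

Running `run_scan()` yields the call trace `move_probe → optimize_contact → capture_image`; the "Verify" step is where the re-planning loop would repeat the contact tool if quality stayed low.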
Problem

Research questions and friction points this paper is trying to address.

robotic ultrasound
autonomous scanning
LLM-based reasoning
scanning guidelines
adaptive workflow
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based agent
robotic ultrasound
autonomous scanning
reinforcement learning fine-tuning
dynamic tool invocation
Yuan Bi
Technical University of Munich
Robotic Ultrasound · Ultrasound Image Processing
Yiping Zhou
Chair for Computer-Aided Medical Procedures and Augmented Reality, Technical University of Munich, Munich, Germany
Pei Liu
Chair for Computer-Aided Medical Procedures and Augmented Reality, Technical University of Munich, Munich, Germany
Feng Li
Technical University of Munich; Ph.D. Student
Zhongliang Jiang
University of Hong Kong
Medical Robotics · Ultrasound Imaging · Robot Learning · Surgical Robotics · Human-Robot Interaction
Nassir Navab
Professor of Computer Science, Technische Universität München