Intelligent Virtual Sonographer (IVS): Enhancing Physician-Robot-Patient Communication

📅 2025-07-17
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing robotic ultrasound research primarily focuses on binary interactions—either patient–robot or clinician–robot—neglecting communication gaps in triadic clinician–robot–patient collaboration. This paper introduces the Intelligent Virtual Sonographer (IVS), the first multimodal virtual agent designed for extended reality (XR) environments. IVS integrates large language models, automatic speech recognition, text-to-speech synthesis, and robotic control to enable natural-language-driven real-time triadic interaction. It accurately interprets clinician commands to control the ultrasound robot while simultaneously providing empathetic, transparent verbal explanations of procedures to patients. Experimental evaluation demonstrates that IVS significantly improves procedural efficiency, clinician–patient trust, and patient experience. By bridging semantic misalignment in human–robot–patient collaborative diagnosis and intervention, IVS establishes a foundational technical framework for context-aware, linguistically grounded medical robotics in XR.

📝 Abstract
The advancement and maturity of large language models (LLMs) and robotics have unlocked vast potential for human-computer interaction, particularly in the field of robotic ultrasound. While existing research primarily focuses on either patient-robot or physician-robot interaction, the role of an intelligent virtual sonographer (IVS) bridging physician-robot-patient communication remains underexplored. This work introduces a conversational virtual agent in Extended Reality (XR) that facilitates real-time interaction between physicians, a robotic ultrasound system (RUS), and patients. The IVS agent communicates with physicians in a professional manner while offering empathetic explanations and reassurance to patients. Furthermore, it actively controls the RUS by executing physician commands and transparently relays these actions to the patient. By integrating LLM-powered dialogue with speech-to-text, text-to-speech, and robotic control, our system enhances the efficiency, clarity, and accessibility of robotic ultrasound acquisition. This work constitutes a first step toward understanding how an IVS can bridge communication gaps in physician-robot-patient interaction, giving physicians more control over, and therefore greater trust in, the physician-robot interaction while improving patient experience and acceptance of robotic ultrasound.
Problem

Research questions and friction points this paper is trying to address.

Bridging physician-robot-patient communication gaps in robotic ultrasound
Enhancing efficiency and clarity of robotic ultrasound acquisition
Improving patient experience and trust in robotic ultrasound
Innovation

Methods, ideas, or system contributions that make the work stand out.

XR-based virtual agent for real-time interaction
LLM-powered dialogue with speech and text integration
Robotic ultrasound control via physician commands
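The triadic loop the paper describes (transcribed physician speech is interpreted, turned into a robot command, and simultaneously verbalized differently for physician and patient) can be sketched minimally as below. All names here (`AgentResponse`, `interpret`, the command table) are illustrative assumptions, not the authors' API; a real system would route the utterance through an LLM and ASR/TTS services rather than a keyword table.

```python
# Minimal sketch of an IVS-style triadic interaction step, assuming a
# keyword-based stand-in for the LLM intent parser described in the paper.
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentResponse:
    robot_command: Optional[str]  # command forwarded to the robotic ultrasound system
    physician_reply: str          # professional acknowledgement spoken to the physician
    patient_explanation: str      # empathetic, transparent explanation spoken to the patient


# Toy command table; a real IVS would prompt an LLM to ground free-form speech.
COMMANDS = {
    "start scan": (
        "scan_start",
        "Starting the scan now.",
        "The robot will gently place the probe on your skin and begin scanning.",
    ),
    "stop": (
        "scan_stop",
        "Stopping the scan.",
        "The robot is lifting the probe; the scan is paused.",
    ),
}


def interpret(utterance: str) -> AgentResponse:
    """Map a transcribed physician utterance to a robot command plus
    role-specific verbal responses for physician and patient."""
    lowered = utterance.lower()
    for phrase, (cmd, phys, pat) in COMMANDS.items():
        if phrase in lowered:
            return AgentResponse(cmd, phys, pat)
    # No actionable command recognized: ask for clarification, keep patient informed.
    return AgentResponse(
        None,
        "Could you rephrase the command?",
        "The doctor is giving the robot an instruction.",
    )


resp = interpret("Please start scan of the left carotid")
print(resp.robot_command)  # scan_start
```

The key design point mirrored here is that one interpreted command fans out into two registers of speech, professional for the physician and reassuring for the patient, which is the communication-gap bridging the paper emphasizes.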
Tianyu Song
Technical University of Munich
Augmented Reality, Robotics, Image-Guided Interventions, Computer Vision
Feng Li
Chair for Computer-Aided Medical Procedures and Augmented Reality, Technical University of Munich, Munich, Germany; Munich Center for Machine Learning, Munich, Germany
Yuan Bi
Technical University of Munich
Robotic Ultrasound, Ultrasound Image Processing
Angelos Karlas
Clinical Resident for Vascular Surgery, Research Group Leader, TUM University Hospital
Vascular Surgery, Optoacoustics, Vascular Biomechanics, Biosignal Processing, Surgical Robotics
Amir Yousefi
Clinic for Vascular Surgery, Helios Klinikum München West, Munich, Germany
Daniela Branzan
Department for Vascular and Endovascular Surgery, Rechts der Isar University Hospital, Technical University of Munich, Munich, Germany
Zhongliang Jiang
University of Hong Kong
Medical Robotics, Ultrasound Imaging, Robot Learning, Surgical Robotics, Human-Robot Interaction
Ulrich Eck
Chair for Computer-Aided Medical Procedures and Augmented Reality, Technical University of Munich, Munich, Germany
Nassir Navab
Professor of Computer Science, Technische Universität München