🤖 AI Summary
Existing robotic ultrasound systems often rely on rule-based or black-box models whose lack of interpretability limits clinical adoption. This work proposes RAG-RUSS, an interpretable framework that integrates retrieval-augmented generation (RAG) with robotic control strategies to perform autonomous carotid ultrasound scanning aligned with the clinical workflow, while explaining the current scanning stage and the next planned action. Retrieval augmentation is used to improve generalization and reduce reliance on large-scale training datasets. Trained on data from 28 volunteers and tested on four volumetric scans recorded from previously unseen volunteers, the method explained the current scanning stage and autonomously planned probe motions to complete the examination in both the transverse and longitudinal planes.
📝 Abstract
Robotic ultrasound (US) has recently attracted increasing attention as a means to overcome the limitations of conventional US examinations, such as strong operator dependence. However, the decision-making process of existing methods is often either rule-based or driven by end-to-end learning models that operate as black boxes. This lack of transparency is regarded as a major barrier to clinical acceptance and raises safety concerns for widespread adoption in routine practice. To tackle this challenge, we introduce RAG-RUSS, an interpretable framework capable of performing a full carotid examination in accordance with the clinical workflow while explicitly explaining both the current stage and the next planned action. Furthermore, given the scarcity of medical data, we incorporate retrieval-augmented generation to enhance generalization and reduce dependence on large-scale training datasets. The method was trained on data acquired from 28 volunteers, while four additional volumetric scans recorded from previously unseen volunteers were reserved for testing. The results demonstrate that the method can explain the current scanning stage and autonomously plan probe motions to complete the carotid examination in both the transverse and longitudinal planes.