🤖 AI Summary
To address inaccurate identification of anatomical landmarks and non-compliant measurements in echocardiographic left ventricular (LV) assessment, this paper proposes AutoSAME, a clinical-guideline-aware framework that jointly optimizes segmentation and keypoint localization. Methodologically, the authors build a dual-branch architecture upon the Segment Anything Model (SAM): one branch performs endocardial segmentation, while the other regresses keypoint heatmaps. Two novel modules are introduced: (i) Filtered Cross-Branch Attention (FCBA), which uses segmentation-branch features, filtered in the frequency domain, to enhance heatmap feature representation, and (ii) Spatial-Guided Prompt Alignment (SGPA), which leverages spatial priors of the LV to automatically generate prompt embeddings for both segmentation and keypoint localization. Evaluated on real-world echocardiography data, the method reports significant improvements over current state-of-the-art approaches: +3.2% Dice score for endocardial segmentation, a 21.7% reduction in keypoint localization error, and 98.4% consistency in clinically critical metrics, including LV volume and ejection fraction.
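For readers unfamiliar with the metrics named in this summary, the following is a minimal sketch of how a Dice score and an ejection fraction are conventionally computed. It is not taken from the AutoSAME code; the function names `dice_score` and `ejection_fraction`, the nested-list mask format, and the treatment of two empty masks as a perfect match are all assumptions made for illustration.

```python
def dice_score(pred, target):
    """Dice overlap between two binary masks given as nested lists of 0/1."""
    pred_flat = [p for row in pred for p in row]
    target_flat = [t for row in target for t in row]
    # Pixels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred_flat, target_flat) if p and t)
    total = sum(1 for p in pred_flat if p) + sum(1 for t in target_flat if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def ejection_fraction(edv_ml, esv_ml):
    """Ejection fraction (percent) from end-diastolic and end-systolic volumes."""
    if edv_ml <= 0:
        raise ValueError("end-diastolic volume must be positive")
    return 100.0 * (edv_ml - esv_ml) / edv_ml


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]
    reference = [[1, 0], [0, 0]]
    print(dice_score(predicted, reference))   # overlap of toy 2x2 masks
    print(ejection_fraction(120.0, 50.0))     # hypothetical volumes in mL
```

The volumes passed to `ejection_fraction` are hypothetical; in a guideline-compliant pipeline they would be derived from the segmented LV contour and the localized landmarks.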
📝 Abstract
Left ventricular (LV) indicator measurements following clinical echocardiography guidelines are important for diagnosing cardiovascular disease. Although existing algorithms have explored automated LV quantification, they can struggle to capture generic visual representations due to the normally small training datasets. Therefore, it is necessary to introduce vision foundational models (VFM) with abundant knowledge. However, VFMs represented by the segment anything model (SAM) are usually suitable for segmentation but incapable of identifying key anatomical points, which are critical in LV indicator measurements. In this paper, we propose a novel framework named AutoSAME, combining the powerful visual understanding of SAM with segmentation and landmark localization tasks simultaneously. Consequently, the framework mimics the operation of cardiac sonographers, achieving LV indicator measurements consistent with clinical guidelines. We further present filtered cross-branch attention (FCBA) in AutoSAME, which leverages relatively comprehensive features in the segmentation to enhance the heatmap regression (HR) of key points from the frequency domain perspective, optimizing the visual representation learned by the latter. Moreover, we propose spatial-guided prompt alignment (SGPA) to automatically generate prompt embeddings guided by spatial properties of LV, thereby improving the accuracy of dense predictions by prior spatial knowledge. The extensive experiments on an echocardiography dataset demonstrate the efficiency of each design and the superiority of our AutoSAME in LV segmentation, landmark localization, and indicator measurements. The code will be available at https://github.com/QC-LIU-1997/AutoSAME.
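As background for the heatmap regression (HR) of key points mentioned in the abstract, the sketch below shows the generic idea only: a landmark is encoded as a Gaussian heatmap for training, and a predicted heatmap is decoded back to a coordinate by taking its peak. It does not reproduce FCBA, SGPA, or any other part of AutoSAME; the helper names, the `sigma` value, and the pixel-distance error are assumptions chosen for illustration.

```python
import math


def gaussian_heatmap(height, width, center, sigma=2.0):
    """Build a target heatmap with a Gaussian peak at center = (x, y)."""
    cx, cy = center
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def decode_keypoint(heatmap):
    """Return the (x, y) position of the heatmap maximum."""
    best_value, best_xy = float("-inf"), (0, 0)
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if value > best_value:
                best_value, best_xy = value, (x, y)
    return best_xy


def localization_error(pred_xy, true_xy):
    """Euclidean distance in pixels between predicted and true landmarks."""
    return math.hypot(pred_xy[0] - true_xy[0], pred_xy[1] - true_xy[1])


if __name__ == "__main__":
    # Hypothetical landmark at x=5, y=9 on a 16x16 grid.
    target = gaussian_heatmap(16, 16, center=(5, 9))
    predicted_xy = decode_keypoint(target)
    print(predicted_xy, localization_error(predicted_xy, (5, 9)))
```

In practice the heatmap would come from the network's keypoint branch rather than from `gaussian_heatmap`, and sub-pixel refinement is often applied on top of the simple peak search shown here.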