🤖 AI Summary
This study addresses a bottleneck in cardiac ultrasound view classification, its heavy reliance on large-scale annotated data, by comparing two self-supervised learning methods, USF-MAE and MoCo v3, on the CACTUS dataset. Under a unified training protocol (learning rate 1e-4, weight decay 0.01) and five-fold cross-validation, the generalization performance of both models is evaluated on a six-class view classification task. USF-MAE outperforms MoCo v3, reaching a mean AUC of 99.99% versus 99.97% and a mean accuracy of 99.33% versus 98.99%, with similar gains in F1 score and recall; the improvements are statistically significant across folds (paired t-test, p = 0.0048). These findings suggest that USF-MAE learns more discriminative features for this task and support it as a promising self-supervised approach for cardiac ultrasound classification.
📝 Abstract
Reliable interpretation of cardiac ultrasound images is essential for accurate clinical diagnosis and assessment. Self-supervised learning has shown promise in medical imaging by leveraging large unlabelled datasets to learn meaningful representations. In this study, we evaluate and compare two self-supervised learning frameworks, USF-MAE (developed by our team) and MoCo v3, on the recently introduced CACTUS dataset (37,736 images) for automated classification of simulated cardiac views (A4C, PL, PSAV, PSMV, Random, and SC). The CACTUS dataset provides expert-annotated cardiac ultrasound images with diverse views. Both models are evaluated with 5-fold cross-validation, enabling robust assessment of generalization performance across multiple random splits. To ensure a fair comparison, we adopt an identical training protocol for both models, with a learning rate of 0.0001 and a weight decay of 0.01. For each fold, we record ROC-AUC, accuracy, F1-score, and recall. Our results indicate that USF-MAE consistently outperforms MoCo v3 across metrics. The average testing AUC for USF-MAE is 99.99% (±0.01%, 95% CI), compared to 99.97% (±0.01%) for MoCo v3. USF-MAE achieves a mean testing accuracy of 99.33% (±0.18%), higher than the 98.99% (±0.28%) obtained by MoCo v3. Similar trends are observed for the F1-score and recall, with improvements statistically significant across folds (paired t-test, p = 0.0048 < 0.01). This proof-of-concept analysis suggests that USF-MAE learns more discriminative features for cardiac view classification than MoCo v3 when applied to this dataset. The improved performance across multiple metrics highlights the potential of USF-MAE for automated cardiac ultrasound classification.
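As a rough illustration of the fold-level statistics mentioned in the abstract (a 95% confidence interval on a per-fold metric and a paired t-test across the five folds), the sketch below uses only the Python standard library. It assumes the interval is a Student's t interval on the fold means, which the abstract does not state. The per-fold accuracies are hypothetical placeholders, not the study's results, and the function names are illustrative.

```python
import math
from statistics import mean, stdev

# Two-sided critical values of Student's t with 4 degrees of freedom
# (5 folds - 1). Standard table values.
T_CRIT_95_DF4 = 2.776
T_CRIT_99_DF4 = 4.604


def ci_half_width(values, t_crit=T_CRIT_95_DF4):
    """Half-width of a t-based confidence interval for the mean of `values`."""
    return t_crit * stdev(values) / math.sqrt(len(values))


def paired_t_statistic(a, b):
    """t statistic of a paired t-test on fold-wise differences a[i] - b[i]."""
    diffs = [x - y for x, y in zip(a, b)]
    standard_error = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs) / standard_error


if __name__ == "__main__":
    # Hypothetical per-fold test accuracies (%); NOT taken from the paper.
    model_a = [99.2, 99.4, 99.3, 99.5, 99.1]
    model_b = [98.9, 99.0, 99.1, 99.2, 98.7]

    for name, acc in (("model A", model_a), ("model B", model_b)):
        print(f"{name}: {mean(acc):.2f}% (+/-{ci_half_width(acc):.2f}%, 95% CI)")

    t = paired_t_statistic(model_a, model_b)
    # |t| above the 99% critical value corresponds to p < 0.01 (two-sided).
    print(f"paired t = {t:.3f}, significant at 0.01: {abs(t) > T_CRIT_99_DF4}")
```

Pairing by fold matters here: both models are scored on the same test split in each fold, so the test is run on the per-fold differences rather than on two independent samples.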