🤖 AI Summary
Fetal ultrasound image classification during the second trimester faces clinical challenges including low image quality, high intra-class variability, and severe class imbalance.
Method: We propose a biologically inspired dual-path deep learning ensemble whose novel modular architecture combines a shallow, coarse-grained pathway with a deep, fine-grained pathway, enabling end-to-end joint classification of 16 fetal anatomical structures within a lightweight model. The framework integrates EfficientNet-B0 and EfficientNet-B6 backbones, employs LDAM-Focal loss to mitigate class imbalance, and uses Dawid-Skene modeling to fuse multi-expert annotations. It is trained directly on real-world noisy clinical data (5,298 routine images).
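To make the class-imbalance term concrete, here is a minimal single-sample sketch of one common way to combine an LDAM margin with focal weighting. The summary does not spell out the exact formulation, so everything here is an assumption: the function names, the combination order (margin first, then focal modulation), and the defaults `max_margin=0.5`, `scale=30.0`, and `gamma=2.0`.

```python
import math

def ldam_margins(class_counts, max_margin=0.5):
    """Per-class margins proportional to n_j^(-1/4), rescaled so the rarest class gets max_margin."""
    raw = [1.0 / (n ** 0.25) for n in class_counts]
    factor = max_margin / max(raw)
    return [factor * r for r in raw]

def ldam_focal_loss(logits, target, margins, scale=30.0, gamma=2.0):
    """Loss for one sample: LDAM margin on the true-class logit, then focal weighting."""
    # Subtract the class-dependent margin from the true-class logit only.
    adjusted = [z - margins[j] if j == target else z for j, z in enumerate(logits)]
    scaled = [scale * z for z in adjusted]
    # Numerically stable log-softmax for the true class.
    top = max(scaled)
    log_norm = top + math.log(sum(math.exp(z - top) for z in scaled))
    log_pt = scaled[target] - log_norm
    pt = math.exp(log_pt)
    # Focal factor (1 - p_t)^gamma down-weights samples that are already easy.
    return -((1.0 - pt) ** gamma) * log_pt

# Hypothetical class counts: the rarest class receives the largest margin.
margins = ldam_margins([1000, 100, 10])
loss = ldam_focal_loss([0.2, 0.1, 0.0], target=2, margins=margins)
```

Setting `gamma=0.0` reduces the sketch to plain LDAM cross-entropy, and zero margins reduce it to a scaled focal loss, which makes each component easy to ablate.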
Results: Our method achieves >0.75 accuracy for 90% of anatomical structures and >0.85 for 75%, matching or exceeding the performance of more complex models on fewer-class tasks. This demonstrates robustness, scalability, and clinical applicability in realistic deployment scenarios.
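The headline figures are shares of classes whose per-class accuracy clears a threshold. The sketch below shows that arithmetic on toy labels. It assumes per-class accuracy means the fraction of each true class predicted correctly (the summary does not define it), and the class names and labels are placeholders, not the study's data.

```python
from collections import Counter

def per_class_accuracy(y_true, y_pred):
    """Fraction of samples of each true class that were predicted correctly."""
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: correct[c] / totals[c] for c in totals}

def share_above(accuracies, threshold):
    """Fraction of classes whose accuracy strictly exceeds the threshold."""
    values = list(accuracies.values())
    return sum(a > threshold for a in values) / len(values)

# Toy placeholder labels for three hypothetical classes.
y_true = ["class_a"] * 4 + ["class_b"] * 2 + ["class_c"] * 4
y_pred = ["class_a"] * 4 + ["class_b", "class_c"] + ["class_c"] * 3 + ["class_a"]

acc = per_class_accuracy(y_true, y_pred)
share_75 = share_above(acc, 0.75)
```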
📝 Abstract
Accurate classification of second-trimester fetal ultrasound images remains challenging due to low image quality, high intra-class variability, and significant class imbalance. In this work, we introduce a simple yet powerful, biologically inspired deep learning ensemble framework that, unlike prior studies focused on only a handful of anatomical targets, simultaneously distinguishes 16 fetal structures. Drawing on the hierarchical, modular organization of biological vision systems, our model stacks two complementary branches (a "shallow" path for coarse, low-resolution cues and a "detailed" path for fine, high-resolution features), concatenating their outputs for final prediction. To our knowledge, no existing method has addressed such a large number of classes with a comparably lightweight architecture. We trained and evaluated on 5,298 routinely acquired clinical images (annotated by three experts and reconciled via Dawid-Skene), reflecting real-world noise and variability rather than a "cleaned" dataset. Despite this complexity, our ensemble (EfficientNet-B0 + EfficientNet-B6 with LDAM-Focal loss) identifies 90% of organs with accuracy > 0.75 and 75% of organs with accuracy > 0.85, performance competitive with more elaborate models applied to far fewer categories. These results demonstrate that biologically inspired modular stacking can yield robust, scalable fetal anatomy recognition in challenging clinical settings.
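For readers unfamiliar with Dawid-Skene reconciliation, the following is a minimal expectation-maximization sketch of the classic model, not the authors' implementation. It assumes every image is labeled by every annotator, uses a small additive `smoothing` constant to avoid zero probabilities, and runs a fixed number of iterations; the function name, parameters, and toy labels are all illustrative.

```python
def dawid_skene(labels, n_classes, n_iter=50, smoothing=1e-2):
    """labels[i][r] is the class index annotator r gave item i.

    Returns, for each item, a posterior distribution over its true class.
    """
    n_items = len(labels)
    n_raters = len(labels[0])
    # Initialise posteriors from per-item vote fractions (soft majority vote).
    post = []
    for votes in labels:
        counts = [0.0] * n_classes
        for v in votes:
            counts[v] += 1.0
        post.append([c / n_raters for c in counts])

    for _ in range(n_iter):
        # M-step: class priors and one confusion matrix per annotator.
        prior = [sum(p[k] for p in post) / n_items for k in range(n_classes)]
        conf = [[[smoothing] * n_classes for _ in range(n_classes)]
                for _ in range(n_raters)]
        for p, votes in zip(post, labels):
            for r, v in enumerate(votes):
                for k in range(n_classes):
                    conf[r][k][v] += p[k]
        for r in range(n_raters):
            for k in range(n_classes):
                row_sum = sum(conf[r][k])
                conf[r][k] = [x / row_sum for x in conf[r][k]]
        # E-step: posterior over the true class of each item.
        new_post = []
        for votes in labels:
            weights = []
            for k in range(n_classes):
                w = prior[k]
                for r, v in enumerate(votes):
                    w *= conf[r][k][v]
                weights.append(w)
            total = sum(weights)
            new_post.append([w / total for w in weights])
        post = new_post
    return post

# Toy example: three annotators, three classes; the third annotator is noisier.
labels = [[0, 0, 0], [0, 0, 1], [1, 1, 1], [1, 1, 0],
          [2, 2, 2], [2, 2, 0], [0, 0, 0], [1, 1, 1]]
posteriors = dawid_skene(labels, n_classes=3)
consensus = [max(range(3), key=p.__getitem__) for p in posteriors]
```

Unlike a plain majority vote, the posteriors weight each annotator by an estimated confusion matrix, so they can be kept as soft labels or collapsed to a single consensus class as in the last line.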