🤖 AI Summary
Accurate ultrasound image segmentation is critical for fetal biometry, yet existing models, which were designed for natural images, struggle with the challenges inherent to medical ultrasound: severe speckle noise, low tissue contrast, and jagged boundaries around small anatomical structures. To address these issues, we propose a high-precision segmentation framework tailored to fetal femur and skull ultrasound images. The method introduces a dual-view convolution that scans independently along the longitudinal and transverse axes, and combines it with feature-aware attention and Mamba-enhanced residual blocks to achieve noise-robust local modeling while capturing both global and local dependencies. Training uses a collaborative strategy that combines multiple optimizers. Evaluated on ultrasound images of varying sizes and orientations, the approach reduces jagged boundary artifacts, converges faster, and achieves the best segmentation accuracy among the compared methods, which improves the reliability of clinical fetal biometric measurements.
📝 Abstract
Accurate ultrasound image segmentation is a prerequisite for precise biometric measurement and reliable clinical assessment. Manual delineation is time-consuming and introduces significant errors. Existing segmentation models, however, are designed around objects in natural scenes and adapt poorly to ultrasound targets, which are noisy and highly similar in appearance. The problem is most evident for small objects, where segmented boundaries show a pronounced jagged effect. This paper therefore proposes FAMSeg, a segmentation model for fetal femur and cranial ultrasound images based on feature perception and Mamba enhancement. Specifically, a convolution block that scans independently along the longitudinal and transverse views and a feature perception module are designed to capture local detail and improve the fusion of contextual information. Combined with a Mamba-optimized residual structure, this design suppresses interference from raw noise and strengthens local multi-dimensional scanning. The network models both global information and local feature dependencies, and is trained with a combination of different optimizers in pursuit of the optimal solution. In extensive experiments, the FAMSeg network achieved the fastest loss reduction and the best segmentation performance across images of varying sizes and orientations.
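The idea of scanning independently along the longitudinal and transverse directions can be illustrated with a small sketch. The abstract does not give kernel shapes, padding, or the way the two views are fused, so everything below is an assumption made for illustration: the function names are hypothetical, each view uses a one-dimensional kernel of odd length with zero padding, and the two responses are simply summed. It is not the FAMSeg block itself.

```python
def scan_rows(image, kernel):
    """Transverse view: slide a 1 x k kernel along every row.

    Assumes an odd kernel length and zero padding, so the output has
    the same shape as the input (cross-correlation, no kernel flip).
    """
    k = len(kernel)
    pad = k // 2
    out = []
    for row in image:
        padded = [0.0] * pad + list(row) + [0.0] * pad
        out.append([
            sum(kernel[j] * padded[i + j] for j in range(k))
            for i in range(len(row))
        ])
    return out


def transpose(image):
    """Swap rows and columns of a list-of-lists image."""
    return [list(col) for col in zip(*image)]


def scan_cols(image, kernel):
    """Longitudinal view: slide the same kind of kernel along every column."""
    return transpose(scan_rows(transpose(image), kernel))


def dual_view_scan(image, row_kernel, col_kernel):
    """Run both scans on the same input, independently, then fuse by summation.

    Summation is an assumed fusion rule; the source does not specify one.
    """
    rows = scan_rows(image, row_kernel)
    cols = scan_cols(image, col_kernel)
    return [[r + c for r, c in zip(rr, cc)] for rr, cc in zip(rows, cols)]
```

With a difference kernel such as `[-1, 0, 1]`, the row scan responds to vertical boundaries and the column scan to horizontal ones, which is one plausible reason for treating the two directions separately when the targets are thin, elongated structures such as a femur.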