🤖 AI Summary
To address the limitations of 2D models—neglecting inter-slice continuity—and 3D models—requiring extensive annotated data—in T2-weighted MRI prostate gland segmentation, this paper proposes a spatiotemporal hybrid network. The method leverages a pretrained DeepLabV3 to extract slice-wise semantic features and incorporates ConvLSTM to explicitly model anatomical temporal dependencies across adjacent slices, treating MRI volumes as spatiotemporal sequences for the first time. This design jointly preserves strong 2D representational capacity and enforces 3D spatial consistency, significantly enhancing robustness under limited annotations and low-contrast conditions. Evaluated on the PROMISE12 benchmark, the approach achieves superior IoU and Dice scores compared to state-of-the-art 2D and 3D methods, demonstrating its efficacy and clinical deployability.
📝 Abstract
Prostate gland segmentation from T2-weighted MRI is a critical yet challenging task in clinical prostate cancer assessment. While deep learning-based methods have significantly advanced automated segmentation, most conventional approaches-particularly 2D convolutional neural networks (CNNs)-fail to leverage inter-slice anatomical continuity, limiting their accuracy and robustness. Fully 3D models offer improved spatial coherence but require large amounts of annotated data, which is often impractical in clinical settings. To address these limitations, we propose a hybrid architecture that models MRI sequences as spatiotemporal data. Our method uses a deep, pretrained DeepLabV3 backbone to extract high-level semantic features from each MRI slice and a recurrent convolutional head, built with ConvLSTM layers, to integrate information across slices while preserving spatial structure. This combination enables context-aware segmentation with improved consistency, particularly in data-limited and noisy imaging conditions. We evaluate our method on the PROMISE12 benchmark under both clean and contrast-degraded test settings. Compared to state-of-the-art 2D and 3D segmentation models, our approach demonstrates superior performance in terms of precision, recall, Intersection over Union (IoU), and Dice Similarity Coefficient (DSC), highlighting its potential for robust clinical deployment.