🤖 AI Summary
This work addresses the segmentation of heterogeneous brain tumor subregions (necrotic core, peritumoral edema, and enhancing tumor) in multiparametric MRI, a task made difficult by highly variable morphology, severe class imbalance, and overlapping appearances across imaging sequences. The authors propose SegGuidedNet, a 3D residual encoder-decoder network featuring a novel SegAttentionGate module. This module produces a spatially discriminative attention map for each subregion within the decoder and is explicitly supervised by a lightweight auxiliary loss. The design helps the decoder distinguish visually ambiguous classes while adding less than 0.2% parameter overhead, and it provides built-in spatial interpretability at inference without any post-hoc explanation method. Evaluated on 251 held-out subjects from each of BraTS2021 and BraTS2023 GLI, SegGuidedNet achieves mean Dice scores of 0.905 and 0.897, respectively. As a single model it surpasses the ensemble-based nnU-Net and HNF-Netv2, and it comes within 2 to 4 Dice points of a ten-model Swin UNETR ensemble at a fraction of the inference cost.
📝 Abstract
Accurate segmentation of brain tumour sub-regions from multi-parametric MRI is critical for treatment planning yet remains challenging due to morphological variability, class imbalance, and overlapping appearances of tumour regions across imaging sequences. We propose SegGuidedNet, a three-dimensional residual encoder-decoder network introducing a novel SegAttentionGate module. Through a lightweight auxiliary loss, the module explicitly supervises the decoder to produce a spatially discriminative attention map for each tumour sub-region (necrotic core, peritumoral oedema, and enhancing tumour), adding less than 0.2% parameter overhead. This sub-region supervision maintains decoder discriminability between visually ambiguous classes and provides spatial interpretability at inference at no extra cost, without any post-hoc explanation method. Evaluated independently on BraTS2021 and BraTS2023 GLI across 251 held-out subjects each, SegGuidedNet achieves mean Dice of 0.905 (ET=0.873, TC=0.906, WT=0.935) and 0.897 (ET=0.859, TC=0.902, WT=0.931) respectively. As a single model, it surpasses the ensemble-based nnU-Net and HNF-Netv2 and approaches Swin UNETR, a 10-model ensemble, to within 2 to 4 Dice points at a fraction of the inference cost. The consistency of results across two benchmark editions further confirms the generalisability of the proposed approach, offering competitive accuracy with built-in interpretability in a lightweight, clinically practical framework.
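The reported mean Dice values are consistent with a plain average of the three per-region scores (ET, TC, WT). The sketch below illustrates that arithmetic and how a per-region Dice can be computed on a toy label map. It is not the authors' evaluation code: the helper names `dice` and `region_masks` are hypothetical, and the label values (1 for necrotic core, 2 for oedema, and a dataset-specific enhancing-tumour label) follow the usual BraTS convention, which is assumed here and not stated in the abstract.

```python
from statistics import mean


def dice(pred, target):
    """Dice coefficient between two equally sized binary masks (flat 0/1 sequences)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Two empty masks are treated as a perfect match (a common convention, assumed here).
    return 1.0 if total == 0 else 2.0 * intersection / total


def region_masks(labels, et_label):
    """Build the nested evaluation regions from a flat label map.

    Assumed convention: 1 = necrotic core, 2 = oedema, et_label = enhancing tumour.
    ET = enhancing tumour; TC = necrotic core + ET; WT = all tumour labels.
    """
    ncr, ed = 1, 2
    return {
        "ET": [int(v == et_label) for v in labels],
        "TC": [int(v in (ncr, et_label)) for v in labels],
        "WT": [int(v in (ncr, ed, et_label)) for v in labels],
    }


# Toy example (made-up voxels, not data from the paper).
target = [0, 1, 2, 4, 4, 0]
pred = [0, 1, 2, 4, 2, 0]
t_regions = region_masks(target, et_label=4)
p_regions = region_masks(pred, et_label=4)
toy_scores = {name: dice(p_regions[name], t_regions[name]) for name in t_regions}

# Per-region scores quoted in the abstract; their average gives the mean Dice.
reported = {
    "BraTS2021": {"ET": 0.873, "TC": 0.906, "WT": 0.935},
    "BraTS2023": {"ET": 0.859, "TC": 0.902, "WT": 0.931},
}
mean_dice = {name: round(mean(s.values()), 3) for name, s in reported.items()}
print(toy_scores)
print(mean_dice)
```

Because the regions are nested, a single mislabelled enhancing-tumour voxel lowers the ET score most, lowers TC less, and leaves WT unchanged in the toy case. This is in line with ET being the lowest of the three reported scores on both datasets.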