🤖 AI Summary
Hierarchical structure preservation in multi-sequence MRI brain tumor segmentation remains challenging due to tissue heterogeneity, ill-defined boundaries, and inter-modal intensity variations. Method: We propose the Unified Multimodal Coherent Field (UMCF) framework, which jointly fuses visual, semantic, and spatial information within a unified 3D latent space, departing from the conventional "process-then-concatenate" paradigm. UMCF incorporates a parameter-free uncertainty gating mechanism for adaptive modality weighting and embeds clinical priors directly into attention computation to enhance interpretability and generalizability. Results: On the BraTS 2020 and 2021 datasets, UMCF+nnU-Net achieves mean Dice scores of 0.8579 and 0.8977, respectively, improving mean Dice by 4.18% on average across mainstream architectures. It notably improves boundary delineation and hierarchical consistency across the whole tumor (WT), tumor core (TC), and enhancing tumor (ET) subregions.
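The summary does not specify how the parameter-free uncertainty gating is computed. As an illustration only, one common parameter-free scheme weights each modality inversely to an uncertainty proxy (here, feature variance) with no learned parameters; the function name and formula below are assumptions, not the authors' method:

```python
import numpy as np

def uncertainty_gate(features, eps=1e-8):
    """Fuse per-modality feature maps with parameter-free uncertainty weights.

    features: list of arrays of identical shape (C, D, H, W), one per MRI
    sequence (e.g. T1, T1ce, T2, FLAIR). Each modality's weight is the
    inverse of its feature variance (a simple uncertainty proxy), normalized
    to sum to 1, so no trainable parameters are involved.
    """
    uncertainties = np.array([f.var() for f in features])
    weights = 1.0 / (uncertainties + eps)
    weights = weights / weights.sum()
    fused = sum(w * f for w, f in zip(weights, features))
    return fused, weights
```

Under this toy scheme, a noisy or low-confidence sequence contributes less to the fused representation, which is the qualitative behavior the summary attributes to adaptive modality weighting.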
📝 Abstract
Brain tumor segmentation requires accurate identification of hierarchical regions, including whole tumor (WT), tumor core (TC), and enhancing tumor (ET), from multi-sequence magnetic resonance imaging (MRI). Due to tumor tissue heterogeneity, ambiguous boundaries, and contrast variations across MRI sequences, methods relying solely on visual information or post-hoc loss constraints show unstable performance in boundary delineation and hierarchy preservation. To address this challenge, we propose the Unified Multimodal Coherent Field (UMCF) method. UMCF achieves synchronous, interactive fusion of visual, semantic, and spatial information within a unified 3D latent space, avoiding the traditional "process-then-concatenate" separated architecture. It adaptively adjusts modality contributions through parameter-free uncertainty gating, with medical prior knowledge participating directly in attention computation. On the Brain Tumor Segmentation (BraTS) 2020 and 2021 datasets, UMCF+nnU-Net achieves average Dice coefficients of 0.8579 and 0.8977, respectively, with an average 4.18% improvement across mainstream architectures. By deeply integrating clinical knowledge with imaging features, UMCF provides a new technical pathway for multimodal information fusion in precision medicine.
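The Dice coefficients reported above are the standard overlap metric used throughout BraTS evaluation; a minimal sketch for a single binary subregion (WT, TC, or ET mask) might look like:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-8):
    """Dice = 2|A ∩ B| / (|A| + |B|) for binary masks.

    pred, target: boolean (or 0/1) arrays of the same shape, e.g. a
    predicted and a ground-truth mask for one tumor subregion. The small
    eps keeps the score defined when both masks are empty.
    """
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
```

A per-case BraTS score is then the mean of this value over the three nested subregions (WT ⊇ TC ⊇ ET), each evaluated as its own binary mask.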