AI Summary
Missing MRI modalities make tumor boundary delineation in glioma imaging imprecise. To address this, the paper proposes a multi-task framework that jointly optimizes tumor segmentation and cross-modal reconstruction. It introduces the Self-Attention Variational Encoder (SAVE) to improve the fusion of heterogeneous modality features, and the Squeeze-Fusion-Excitation Cross Awareness (SFECA) module to let the segmentation and reconstruction tasks optimize cooperatively. The overall architecture integrates Vision XLSTM with a Hetero-Modal Variational Encoder-Decoder (HVED) to deeply fuse spatial and temporal features. Evaluated on the BraTS 2024 dataset under single- and dual-modality missing scenarios, the method is reported to achieve state-of-the-art performance, with a segmentation Dice score improvement of over 3.2% and a 4.7 dB gain in reconstruction PSNR relative to existing approaches, indicating greater robustness when modalities are missing.
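For readers unfamiliar with the two metrics named above, the following is a minimal sketch of how a Dice score and a PSNR value are commonly computed. It is not taken from the paper's code: the function names, the flat-list inputs, the `max_value` parameter, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
import math


def dice_score(pred, target):
    """Dice coefficient between two flat binary masks (sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def psnr(reference, reconstruction, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two flat intensity sequences.

    max_value is the assumed maximum possible intensity (1.0 for
    images normalized to the range [0, 1]).
    """
    if len(reference) != len(reconstruction):
        raise ValueError("inputs must have the same length")
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)
    # Identical inputs have zero error, so PSNR is unbounded.
    return math.inf if mse == 0 else 10.0 * math.log10(max_value ** 2 / mse)
```

Because PSNR is logarithmic, a gain of a few dB corresponds to a multiplicative reduction in mean squared error, whereas a Dice improvement is a change in overlap on a 0 to 1 scale.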
Abstract
Neurogliomas are among the most aggressive forms of cancer, presenting considerable challenges in both treatment and monitoring due to their unpredictable biological behavior. Magnetic resonance imaging (MRI) is currently the preferred method for diagnosing and monitoring gliomas. However, specific imaging modalities are often unavailable, and their absence compromises the accuracy of tumor segmentation. To address this issue, we introduce the XLSTM-HVED model, which integrates a hetero-modal encoder-decoder framework with the Vision XLSTM module to reconstruct missing MRI modalities. By deeply fusing spatial and temporal features, the model enhances tumor segmentation performance. The key innovation of our approach is the Self-Attention Variational Encoder (SAVE) module, which improves the integration of modal features. In addition, the Squeeze-Fusion-Excitation Cross Awareness (SFECA) module optimizes the interaction of features between the segmentation and reconstruction tasks. Our experiments on the BraTS 2024 dataset demonstrate that our model significantly outperforms existing advanced methods when modalities are missing. Our source code is available at https://github.com/Quanato607/XLSTM-HVED.
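To make the missing-modality setting concrete, the sketch below enumerates the cases in which one or two of four MRI modalities are absent. It is an illustration only, not the authors' evaluation script: the modality labels in `MODALITIES`, the function name `missing_scenarios`, and the `max_missing` parameter are assumptions introduced here.

```python
from itertools import combinations

# Assumed labels for the four MRI sequences commonly used in BraTS-style data.
MODALITIES = ("T1", "T1ce", "T2", "FLAIR")


def missing_scenarios(modalities=MODALITIES, max_missing=2):
    """Return (missing, available) pairs for 1..max_missing absent modalities."""
    scenarios = []
    for k in range(1, max_missing + 1):
        for missing in combinations(modalities, k):
            # Everything not dropped remains available as model input.
            available = tuple(m for m in modalities if m not in missing)
            scenarios.append((missing, available))
    return scenarios


if __name__ == "__main__":
    for missing, available in missing_scenarios():
        print("missing:", missing, "| available:", available)
```

With four modalities this yields four single-missing and six dual-missing cases, so a model evaluated this way is scored on ten input configurations in addition to the full-modality case.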