AI-Driven MRI-based Brain Tumour Segmentation Benchmarking

📅 2025-06-25

📈 Citations: 0

✨ Influential: 0

career value

202K/year

🤖 AI Summary

Current promptable medical image segmentation models lack systematic evaluation across varying prompt quality levels. Method: We conduct the first zero-shot benchmark on the BraTS 2023 multimodal MRI dataset, evaluating SAM, SAM 2, MedSAM, SAM-Med-3D, and nnU-Net. We compare point prompts against high-precision bounding box prompts and introduce pediatric tumor data for fine-tuning to enhance point-prompt performance. Contribution/Results: Under bounding box prompts, SAM and SAM 2 achieve Dice scores of 0.894 and 0.893—surpassing nnU-Net. Fine-tuning on pediatric oncology data significantly improves point-prompt accuracy. Our analysis demonstrates that prompt quality critically governs model performance, affirming the viability of general-purpose vision foundation models in medical segmentation. This work provides empirical evidence and methodological guidance for prompt engineering in clinical imaging analysis.

Technology Category

Application Category

📝 Abstract

Medical image segmentation has greatly aided medical diagnosis, with U-Net based architectures and nnU-Net providing state-of-the-art performance. There have been numerous general promptable models and medical variations introduced in recent years, but there is currently a lack of evaluation and comparison of these models across a variety of prompt qualities on a common medical dataset. This research uses Segment Anything Model (SAM), Segment Anything Model 2 (SAM 2), MedSAM, SAM-Med-3D, and nnU-Net to obtain zero-shot inference on the BraTS 2023 adult glioma and pediatrics dataset across multiple prompt qualities for both points and bounding boxes. Several of these models exhibit promising Dice scores, particularly SAM and SAM 2 achieving scores of up to 0.894 and 0.893, respectively when given extremely accurate bounding box prompts which exceeds nnU-Net's segmentation performance. However, nnU-Net remains the dominant medical image segmentation network due to the impracticality of providing highly accurate prompts to the models. The model and prompt evaluation, as well as the comparison, are extended through fine-tuning SAM, SAM 2, MedSAM, and SAM-Med-3D on the pediatrics dataset. The improvements in point prompt performance after fine-tuning are substantial and show promise for future investigation, but are unable to achieve better segmentation than bounding boxes or nnU-Net.

Problem

Research questions and friction points this paper is trying to address.

Evaluates AI models for MRI brain tumor segmentation

Compares prompt-based models across different prompt qualities

Assesses fine-tuning impact on segmentation performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Benchmarks SAM models on BraTS 2023 dataset

Evaluates zero-shot inference with varied prompts

Fine-tunes models for pediatric data improvement

🔎 Similar Papers

An Integrated Deep Learning Framework for Effective Brain Tumor Localization, Segmentation, and Classification from Magnetic Resonance Images