The MAMA-MIA Challenge: Advancing Generalizability and Fairness in Breast MRI Tumor Segmentation and Treatment Response Prediction

📅 2026-03-01
🤖 AI Summary
This study addresses the limited generalizability and insufficient fairness of existing AI models for breast MRI, which are often trained on single-center data and evaluated only with aggregate metrics. The MAMA-MIA Challenge introduces a large-scale, cross-continental benchmark that jointly evaluates two tasks using pre-treatment MRI: primary tumor segmentation and prediction of pathologic complete response after neoadjuvant chemotherapy. Models were trained on 1,506 patients from multiple US institutions and tested on an external set of 574 patients from three European centers, with a unified score that combines overall performance with subgroup consistency across age, menopausal status, and breast density. Twenty-six international teams took part in the final evaluation phase. Results show substantial performance variability under external testing and a trade-off between overall accuracy and subgroup fairness, providing a benchmark and empirical basis for developing robust and equitable AI systems in breast cancer imaging.

📝 Abstract
Breast cancer is the most frequently diagnosed malignancy among women worldwide and a leading cause of cancer-related mortality. Dynamic contrast-enhanced magnetic resonance imaging plays a central role in tumor characterization and treatment monitoring, particularly in patients receiving neoadjuvant chemotherapy. However, existing artificial intelligence models for breast magnetic resonance imaging are often developed using single-center data and evaluated using aggregate performance metrics, limiting their generalizability and obscuring potential performance disparities across demographic subgroups. The MAMA-MIA Challenge was designed to address these limitations by introducing a large-scale benchmark that jointly evaluates primary tumor segmentation and prediction of pathologic complete response using pre-treatment magnetic resonance imaging only. The training cohort comprised 1,506 patients from multiple institutions in the United States, while evaluation was conducted on an external test set of 574 patients from three independent European centers to assess cross-continental and cross-institutional generalization. A unified scoring framework combined predictive performance with subgroup consistency across age, menopausal status, and breast density. Twenty-six international teams participated in the final evaluation phase. Results demonstrate substantial performance variability under external testing and reveal trade-offs between overall accuracy and subgroup fairness. The challenge provides standardized datasets, evaluation protocols, and public resources to promote the development of robust and equitable artificial intelligence systems for breast cancer imaging.
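
The abstract describes a unified score that combines predictive performance with subgroup consistency but does not give its formula. The following is a minimal, purely illustrative sketch of one way such a score could be built for the segmentation task. The function names, the gap-based fairness term, and the equal weighting are all assumptions for illustration and are not the challenge's actual definition.

```python
from collections import defaultdict
from statistics import mean


def dice(pred, truth):
    """Dice overlap between two binary masks given as flat 0/1 sequences."""
    inter = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * inter / total


def subgroup_means(scores, groups):
    """Mean per-case score within each subgroup label (e.g. menopausal status)."""
    by_group = defaultdict(list)
    for score, group in zip(scores, groups):
        by_group[group].append(score)
    return {group: mean(values) for group, values in by_group.items()}


def combined_score(scores, groups, weight=0.5):
    """Hypothetical unified score: weighted mix of overall mean and fairness.

    Fairness is assumed here to be 1 minus the gap between the best and
    worst subgroup means; the real challenge metric may differ.
    """
    overall = mean(scores)
    means = subgroup_means(scores, groups)
    gap = max(means.values()) - min(means.values())
    fairness = 1.0 - gap
    return weight * overall + (1.0 - weight) * fairness


if __name__ == "__main__":
    # Toy per-patient Dice scores with a hypothetical subgroup label each.
    case_scores = [0.8, 0.6, 0.9, 0.7]
    case_groups = ["pre", "pre", "post", "post"]
    print(subgroup_means(case_scores, case_groups))
    print(combined_score(case_scores, case_groups))
```

Under this assumed definition, a model with a high average score but a large gap between subgroups is penalized, which is one way the reported trade-off between overall accuracy and subgroup fairness can show up in a ranking.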
Problem

Research questions and friction points this paper is trying to address.

generalizability
fairness
breast MRI
tumor segmentation
treatment response prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

generalizability
fairness
multi-center benchmark
tumor segmentation
treatment response prediction
Lidia Garrucho
Universitat de Barcelona
Artificial Intelligence, Medical Image Analysis, Breast Cancer
Smriti Joshi
Artificial Intelligence in Medicine (BCN-AIM), Universitat de Barcelona
Medical Image Analysis, Deep Learning, Radiomics, Artificial Intelligence
Kaisar Kushibar
University of Barcelona
Medical Image Analysis, Deep Learning, Computer Vision
Richard Osuala
University of Barcelona
Medical Image Analysis, Generative Models, Computer Vision, Deep Learning
Maciej Bobowicz
Medical University of Gdansk, Poland
Oncological Surgery, General Surgery, Miniinvasive Surgery, Metabolic Surgery
Xavier Bargalló
Department of Radiology, Hospital Clínic of Barcelona, Barcelona, Spain
Paulius Jaruševičius
Department of Radiology, Lithuanian University of Health Sciences, Kaunas, Lithuania
Kai Geissler
Fraunhofer Institute for Digital Medicine MEVIS, Germany
Raphael Schäfer
Fraunhofer Institute for Digital Medicine MEVIS, Germany
Muhammad Alberb
Department of Medical Biophysics, University of Toronto, Canada
Tony Xu
University of Toronto
Computer Vision, Medical Imaging, Deep Learning
Anne Martel
Medical Biophysics, University of Toronto. Physical Sciences, Sunnybrook Research Institute
Medical Imaging, Machine learning
Daniel Sleiman
University of Amsterdam, Amsterdam, The Netherlands
Navchetan Awasthi
Assistant Professor, University of Amsterdam
Inverse problems, Medical Image Analysis, Biomedical Optics, Photoacoustic imaging, Deep Learning
Hadeel Awwad
Computer Vision and Robotics Institute (ViCOROB), University of Girona, Girona, Spain
Joan C. Vilanova
Department of Radiology, Clínica Girona, Institute of Diagnostic Imaging (IDI), Girona, Spain
Robert Martí
Computer Vision and Robotics Institute (ViCOROB), University of Girona, Girona, Spain
Daan Schouten
Stanford University, Stanford, CA, USA
Jeong Hoon Lee
Stanford University, Stanford, CA, USA
Mirabela Rusu
Assistant Professor of Radiology at Stanford University
multi-protocol, multi-scale data fusion, MRI, Histology, computational imaging
Eleonora Poeta
Politecnico di Torino, Turin, Italy
Luisa Vargas
EURECOM, France
Eliana Pastor
Politecnico di Torino
Explainable AI, Trustworthy AI, Fairness in AI
Maria A. Zuluaga
EURECOM
Jessica Kächele
German Cancer Research Center (DKFZ), Division of Medical Image Computing, and the Medical Faculty Heidelberg, Heidelberg University, Germany