Who Does Your Algorithm Fail? Investigating Age and Ethnic Bias in the MAMA-MIA Dataset

📅 2025-10-31
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study identifies, for the first time, systematic age- and race-related fairness disparities in deep learning–based breast cancer image segmentation models evaluated on the MAMA-MIA dataset. Method: We employ hierarchical statistical auditing and multi-source controlled analysis to quantify performance variations across age groups, racial subgroups, and data collection sites. Contribution/Results: We demonstrate statistically significant and robust segmentation performance degradation (p < 0.001) for younger patients—a bias invariant to data source, suggesting inadequate modeling of age-associated physiological features. Furthermore, we show that aggregating multi-center data obscures site-specific racial biases, leading to misleading aggregate fairness assessments. To address this, we propose the first age-sensitivity–aware fairness verification framework for medical image segmentation and characterize the nonlinear impact of multi-source data integration on racial bias propagation. Our findings provide both methodological foundations and empirical evidence to support equitable clinical deployment of AI in oncology imaging.

Technology Category

Application Category

📝 Abstract
Deep learning models aim to improve diagnostic workflows, but fairness evaluation remains underexplored beyond classification, e.g., in image segmentation. Unaddressed segmentation bias can lead to disparities in the quality of care for certain populations, potentially compounded across clinical decision points and amplified through iterative model development. Here, we audit the fairness of the automated segmentation labels provided in the breast cancer tumor segmentation dataset MAMA-MIA. We evaluate automated segmentation quality across age, ethnicity, and data source. Our analysis reveals an intrinsic age-related bias against younger patients that continues to persist even after controlling for confounding factors, such as data source. We hypothesize that this bias may be linked to physiological factors, a known challenge for both radiologists and automated systems. Finally, we show how aggregating data from multiple data sources influences site-specific ethnic biases, underscoring the necessity of investigating data at a granular level.
Problem

Research questions and friction points this paper is trying to address.

Auditing fairness in breast cancer tumor segmentation algorithms
Identifying age and ethnic bias in automated medical image segmentation
Evaluating segmentation quality disparities across patient demographics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Auditing segmentation bias in breast cancer dataset
Revealing age-related bias persists after confounder control
Analyzing multi-source data aggregation impact on ethnic bias
🔎 Similar Papers
No similar papers found.