🤖 AI Summary
This work addresses racial bias in AI-driven cardiac magnetic resonance (CMR) image segmentation, specifically performance disparities between Black and White subjects. We systematically evaluate the efficacy of mainstream bias mitigation strategies, including oversampling, importance reweighing, and Group Distributionally Robust Optimization (Group DRO), both individually and in combination. We also evaluate the same methods on models trained and evaluated on cropped CMR images. Results demonstrate that oversampling significantly improves segmentation accuracy for Black subjects (Dice score +3.2%) without significantly degrading performance on White subjects. Image cropping alone reduces bias, and combining it with oversampling yields the strongest fairness improvement, reducing inter-group Dice disparity by 57%. To our knowledge, this is the first comprehensive empirical study of bias mitigation in CMR segmentation. Our findings provide a reproducible methodological framework and empirical evidence to advance fairness in medical AI.
📝 Abstract
Artificial intelligence (AI) is increasingly used for medical imaging tasks. However, the resulting models can be biased, particularly when trained on imbalanced datasets. One example is the strong race bias reported in cardiac magnetic resonance (CMR) image segmentation models. Although this phenomenon has been reported in a number of publications, little is known about the effectiveness of bias mitigation algorithms in this domain. We investigate how well common bias mitigation methods address bias between Black and White subjects in AI-based CMR segmentation models. Specifically, we use oversampling, importance reweighing, and Group DRO, as well as combinations of these techniques, to mitigate the race bias. Furthermore, motivated by recent findings on the root causes of AI-based CMR segmentation bias, we evaluate the same methods using models trained and evaluated on cropped CMR images. We find that oversampling mitigates the bias: it significantly improves performance for the underrepresented Black subjects without significantly reducing performance for the majority White subjects. Group DRO, alone or combined with oversampling, also improves performance for Black subjects, but not significantly, while reweighing decreases performance for Black subjects. Using cropped images increases performance for both races and reduces the bias, and adding oversampling to training on cropped images reduces the bias further.
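For readers unfamiliar with the three mitigation methods named in the abstract, the following is a minimal sketch of the core computation behind each one. It is not the authors' implementation. The function names, the group labels, and the step size `eta` are hypothetical choices made for illustration. The reweighing shown here is a simplified variant that weights by group frequency only, and the Group DRO update is the standard exponentiated-gradient step on group weights.

```python
import numpy as np


def oversampling_probs(groups):
    """Per-sample sampling probabilities under which every group is
    drawn equally often, regardless of its size in the dataset."""
    groups = np.asarray(groups).ravel()
    labels, inverse, counts = np.unique(
        groups, return_inverse=True, return_counts=True
    )
    # Each group receives total mass 1/num_groups, split evenly
    # among its members.
    return 1.0 / (len(labels) * counts[inverse])


def reweighing_weights(groups):
    """Per-sample loss weights inversely proportional to group frequency.
    Simplified assumption: weights depend on the group only and are
    scaled so that the mean weight over the dataset is 1."""
    groups = np.asarray(groups).ravel()
    labels, inverse, counts = np.unique(
        groups, return_inverse=True, return_counts=True
    )
    return len(groups) / (len(labels) * counts[inverse])


def group_dro_step(q, group_losses, eta=0.1):
    """One exponentiated-gradient update of the Group DRO group weights:
    groups with a higher current loss receive more weight."""
    q = np.asarray(q, dtype=float) * np.exp(eta * np.asarray(group_losses))
    return q / q.sum()


def group_dro_loss(q, group_losses):
    """Training objective for the model step: the q-weighted sum of the
    per-group average losses."""
    return float(np.dot(q, group_losses))
```

With a made-up dataset of nine "White" and one "Black" subject, `oversampling_probs` gives the single minority sample the same total sampling mass as the nine majority samples combined, while `reweighing_weights` leaves sampling unchanged and instead scales each sample's contribution to the loss. Group DRO differs from both in that its weights are not fixed in advance: they are updated during training from the per-group losses.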