🤖 AI Summary
Problem: Cardiac MRI datasets are severely imbalanced and biased with respect to sensitive attributes such as sex, age, BMI, and health status, which hinders fair and robust model training.
Method: This work applies a ControlNet-augmented latent diffusion model (LDM) to medical image debiasing. It enables fine-grained, sensitive-attribute-controllable synthesis by jointly conditioning on text assembled from patient metadata and on cardiac geometry derived from segmentation masks. All experiments run on a single consumer-grade GPU.
Contribution/Results: Synthesized data substantially improve coverage of underrepresented subgroups, particularly young patients and heart failure patients with normal BMI. Downstream classifiers achieve a 12.7% absolute accuracy gain on minority groups and reduce inter-group performance disparity by 38%. With an FID of 14.2, the generated images are reported to meet clinical usability standards, indicating both improved fairness and practical deployability.
📝 Abstract
Progress in deep learning solutions for disease diagnosis and prognosis based on cardiac magnetic resonance imaging is hindered by highly imbalanced and biased training data. To address this issue, we propose a method to alleviate imbalances inherent in datasets by generating synthetic data conditioned on sensitive attributes such as sex, age, body mass index, and health condition. We adopt ControlNet, based on a denoising diffusion probabilistic model, to condition on text assembled from patient metadata and on cardiac geometry derived from segmentation masks, using data from a large cohort study, the UK Biobank. We assess our method by evaluating the realism of the generated images using established quantitative metrics. Furthermore, we conduct a downstream classification task aimed at debiasing a classifier by rectifying imbalances within underrepresented groups through synthetically generated samples. Our experiments demonstrate the effectiveness of the proposed approach in mitigating dataset imbalances, such as the scarcity of younger patients or of individuals with a normal BMI suffering from heart failure. This work represents a major step towards the adoption of synthetic data for the development of fair and generalizable models for medical classification tasks. Notably, we conduct all our experiments using a single, consumer-level GPU to highlight the feasibility of our approach within resource-constrained environments. Our code is available at https://github.com/faildeny/debiasing-cardiac-mri.
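The two ideas in the abstract, text conditions assembled from patient metadata and synthetic samples used to fill underrepresented groups, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Patient` fields, the prompt template in `build_prompt`, and the "match the largest subgroup" rule in `synthetic_quota` are hypothetical and are not taken from the paper or its repository.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    # Hypothetical metadata record; field names and value formats are illustrative.
    sex: str
    age_group: str
    bmi_group: str
    condition: str


def build_prompt(p: Patient) -> str:
    """Assemble a text condition from patient metadata (assumed template)."""
    return (
        f"cardiac MRI, {p.sex}, age {p.age_group}, "
        f"BMI {p.bmi_group}, {p.condition}"
    )


def synthetic_quota(patients: list[Patient]) -> dict[Patient, int]:
    """Synthetic samples each subgroup needs to match the largest subgroup."""
    counts = Counter(patients)  # frozen dataclasses are hashable, so they can be counted
    target = max(counts.values())
    return {group: target - n for group, n in counts.items() if n < target}


# Toy cohort: five majority-group patients, two minority-group patients.
cohort = (
    [Patient("male", "60-69", "overweight", "heart failure")] * 5
    + [Patient("female", "40-49", "normal", "heart failure")] * 2
)

for group, n in synthetic_quota(cohort).items():
    print(n, build_prompt(group))
# prints: 3 cardiac MRI, female, age 40-49, BMI normal, heart failure
```

In a pipeline of the kind the abstract describes, each prompt would be paired with a segmentation mask as the geometric condition, and the quota would determine how many images to generate per subgroup. Other balancing rules, such as a fixed per-group target, would fit the same structure.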