AI Summary
Deep learning models for tumor classification in histopathological images generalize poorly to underrepresented subpopulations because of biases introduced by staining protocols, scanning devices, hospitals, and demographic variation, which leads to shortcut learning and prediction disparities. To address this, we propose a metadata-guided generative diffusion framework that explicitly incorporates clinical metadata (e.g., stain type, institution, demographic attributes) into the conditional diffusion architecture, enabling zero-shot, high-fidelity synthesis of histopathological images across diverse subpopulations. Leveraging TCGA pretraining and fine-grained metadata alignment, the method generates high-quality images for unseen subpopulations to debias downstream classifiers. Experiments show that classifiers trained on the synthesized data achieve an average accuracy gain of 8.3% and a 62% reduction in Equalized Odds difference on subpopulation-shifted test sets, significantly outperforming conventional data augmentation and robust-training baselines.
Abstract
Deep learning models have made significant advances in histological prediction tasks in recent years. However, for adoption in clinical practice, their lack of robustness to varying conditions such as stain, scanner, hospital, and demographics is still a limiting factor: when trained on overrepresented subpopulations, models regularly struggle with less frequent patterns, leading to shortcut learning and biased predictions. Large-scale foundation models have not fully eliminated this issue. We therefore propose a novel approach that explicitly models such metadata in a Metadata-guided generative Diffusion model framework (MeDi). MeDi allows for targeted augmentation of underrepresented subpopulations with synthetic data, which balances limited training data and mitigates biases in downstream models. We experimentally show that MeDi generates high-quality histopathology images for unseen subpopulations in TCGA, boosts the overall fidelity of the generated images, and improves the performance of downstream classifiers on datasets with subpopulation shifts. Our work is a proof of concept towards better mitigating data biases with generative models.
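The abstract describes conditioning a diffusion model on clinical metadata (stain, hospital, demographics) so that underrepresented subpopulations can be synthesized on demand. The paper's exact architecture is not given here, so the following is only a minimal NumPy sketch of the general idea: discrete metadata fields are embedded and summed with the usual sinusoidal timestep embedding to form the conditioning signal a denoiser would receive, alongside a standard DDPM-style forward noising step. The field names (`STAINS`, `SITES`) and dimensions are illustrative assumptions, not MeDi's actual configuration.

```python
# Hedged sketch: metadata-conditioned diffusion inputs. All vocabularies,
# dimensions, and names below are illustrative assumptions, not MeDi's design.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical metadata vocabularies; the paper's actual fields may differ.
STAINS = {"H&E": 0, "IHC": 1}
SITES = {"hospital_A": 0, "hospital_B": 1, "hospital_C": 2}
EMB_DIM = 8


class MetadataEmbedder:
    """Maps each discrete metadata field to a vector and sums them."""

    def __init__(self, vocab_sizes, dim):
        self.tables = [rng.normal(size=(v, dim)) for v in vocab_sizes]

    def __call__(self, ids):
        return sum(table[i] for table, i in zip(self.tables, ids))


def timestep_embedding(t, dim):
    """Standard sinusoidal embedding of the diffusion timestep."""
    freqs = np.exp(-np.log(10000.0) * np.arange(dim // 2) / (dim // 2))
    return np.concatenate([np.sin(t * freqs), np.cos(t * freqs)])


# Forward (noising) process q(x_t | x_0) with a linear beta schedule.
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas_bar = np.cumprod(1.0 - betas)


def q_sample(x0, t):
    """Sample x_t from x_0 at timestep t; returns the sample and the noise."""
    noise = rng.normal(size=x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * noise, noise


embedder = MetadataEmbedder([len(STAINS), len(SITES)], EMB_DIM)
# Conditioning vector for an H&E slide from a specific (underrepresented) site.
cond = embedder([STAINS["H&E"], SITES["hospital_B"]]) + timestep_embedding(500, EMB_DIM)

x0 = rng.normal(size=(EMB_DIM,))  # stand-in for an image latent
xt, eps = q_sample(x0, 500)       # the noised input a denoiser would see
print(cond.shape, xt.shape)
```

In a full model, `cond` would be injected into the denoising network (e.g., added to feature maps or used via cross-attention); at sampling time, picking metadata IDs for a rare subpopulation steers generation toward it, which is the targeted-augmentation idea the abstract describes.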