🤖 AI Summary
Deploying deep learning for medical image classification across multi-center clinical sites is hindered by distribution shifts arising from heterogeneous imaging protocols, scanner hardware, and operator practices, a problem made worse by scarce annotated data. To address this, we propose a semantic-guided, domain-directed data augmentation framework embedded within the Invariant Risk Minimization (IRM) paradigm that jointly enforces semantic alignment and reduces distributional discrepancy. Our key contribution is the first cross-domain covariance-guided mechanism for selecting augmentation directions, which replaces conventional random augmentation and improves IRM’s robustness in medical imaging. The method integrates cross-domain covariance modeling, semantic-aware augmentation, and multi-center representation learning. Evaluated on a multi-center diabetic retinopathy dataset under challenging conditions (a few-shot setting with fewer than 100 samples per site and high inter-site heterogeneity), our approach achieves a 5.2% absolute accuracy improvement over state-of-the-art methods.
📝 Abstract
Deep learning has achieved remarkable success in medical image classification. However, its clinical application is often hindered by data heterogeneity caused by variations in scanner vendors, imaging protocols, and operators. Approaches such as invariant risk minimization (IRM) aim to address this challenge of out-of-distribution generalization. For instance, VIRM improves upon IRM by tackling the issue of insufficient feature support overlap and shows promising results. Nonetheless, these methods remain limited in medical imaging by the scarcity of annotated data and by inefficient augmentation strategies. To address these issues, we propose a novel domain-oriented direction selector to replace the random augmentation strategy used in VIRM. Our method uses inter-domain covariance to guide the augmentation direction, steering augmented samples toward the target domain. This reduces domain discrepancies and improves generalization performance. Experiments on a multi-center diabetic retinopathy dataset demonstrate that our method outperforms state-of-the-art approaches, particularly under limited data and significant domain heterogeneity.
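The idea of covariance-guided, domain-directed augmentation can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, which the abstract does not specify. It assumes augmentation happens in feature space, that the "inter-domain covariance" is approximated by the covariance of the pooled source and target features, and that the augmentation direction is the source-to-target mean shift projected onto the top principal directions of that covariance. The function name `directed_augment` and the parameters `strength` and `k` are hypothetical.

```python
import numpy as np


def directed_augment(src, tgt, strength=0.5, k=2, rng=None):
    """Shift source features toward the target domain (illustrative only).

    src, tgt : arrays of shape (n_samples, n_features)
    strength : upper bound on the per-sample step size (hypothetical knob)
    k        : number of leading covariance directions to keep (hypothetical)
    """
    rng = np.random.default_rng(rng)

    # Mean shift between the two domains.
    shift = tgt.mean(axis=0) - src.mean(axis=0)

    # Assumption: pooled covariance stands in for the inter-domain covariance.
    pooled = np.vstack([src, tgt])
    cov = np.cov(pooled, rowvar=False)

    # eigh returns eigenvalues in ascending order, so the last k columns
    # are the directions of largest pooled variance.
    _, eigvecs = np.linalg.eigh(cov)
    top = eigvecs[:, -k:]

    # Project the mean shift onto the top-k subspace: this is the
    # "selected" augmentation direction, replacing a random one.
    direction = top @ (top.T @ shift)

    # Each sample takes a random step in [0, strength] along that direction.
    steps = rng.uniform(0.0, strength, size=(src.shape[0], 1))
    return src + steps * direction


# Usage with synthetic features: the target domain is offset along axis 0.
gen = np.random.default_rng(0)
source = gen.normal(size=(200, 4))
target = gen.normal(size=(200, 4)) + np.array([3.0, 0.0, 0.0, 0.0])
augmented = directed_augment(source, target, strength=0.5, k=2, rng=1)
```

In this sketch the augmented samples move only along directions in which the pooled data varies most, so their mean lands closer to the target-domain mean than the original source mean does. A random augmentation would instead spend part of its perturbation on directions unrelated to the domain gap.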