🤖 AI Summary
CLIP models exhibit demographic biases, such as those related to race and gender, in medical image analysis, leading to unfair diagnostic outcomes and degraded performance for underrepresented subgroups. To address this, the authors propose AdFair-CLIP, a fairness-enhanced contrastive language–image pretraining framework tailored to chest X-ray (CXR) analysis. Its core innovation is an adversarial feature intervention mechanism that suppresses sensitive attributes within the joint language–image embedding space, thereby mitigating spurious correlations. AdFair-CLIP integrates contrastive learning, cross-modal alignment, and decorrelation constraints to support fair zero-shot and few-shot generalization. Evaluated on multi-center CXR datasets, AdFair-CLIP improves results across all demographic subgroups: average classification accuracy increases by 3.2%, and the Equalized Odds difference decreases by 58%, while strong generalization capability is preserved. This work establishes a new benchmark for fairness-aware learning in medical vision foundation models.
📝 Abstract
Contrastive Language-Image Pre-training (CLIP) models have demonstrated superior performance across various visual tasks, including medical image classification. However, fairness concerns, including demographic biases, have received limited attention for CLIP models. This oversight leads to critical issues, particularly biases related to race and gender, producing disparities in diagnostic outcomes and reduced reliability for underrepresented groups. To address these challenges, we introduce AdFair-CLIP, a novel framework that employs adversarial feature intervention to suppress sensitive attributes, thereby mitigating spurious correlations and improving prediction fairness. We conduct comprehensive experiments on chest X-ray (CXR) datasets and show that AdFair-CLIP significantly enhances both fairness and diagnostic accuracy while maintaining robust generalization in zero-shot and few-shot scenarios. These results establish new benchmarks for fairness-aware learning in CLIP-based medical diagnostic models, particularly for CXR analysis.
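The adversarial feature intervention described above can be illustrated with a gradient-reversal sketch, a common realization of adversarial debiasing. The paper's actual architecture and losses are not reproduced here; the linear encoder, heads, and toy data below are purely illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy data (NOT the paper's dataset): feature 0 carries the
# diagnostic label, feature 1 carries a sensitive attribute; the two are
# statistically independent by construction.
n = 400
y = rng.integers(0, 2, n).astype(float)   # task label
a = rng.integers(0, 2, n).astype(float)   # sensitive attribute
x = np.stack([2 * y - 1, 2 * a - 1], axis=1)
x += 0.3 * rng.standard_normal((n, 2))

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

W = 0.1 * rng.standard_normal((2, 2))  # shared linear "encoder"
u = 0.1 * rng.standard_normal(2)       # task head
v = 0.1 * rng.standard_normal(2)       # adversary head (predicts a)
lr, lam = 0.1, 1.0                     # learning rate, reversal strength

for _ in range(500):
    z = x @ W.T                        # embeddings
    g_t = (sigmoid(z @ u) - y) / n     # dL_task / d(task logit)
    g_a = (sigmoid(z @ v) - a) / n     # dL_adv  / d(adv logit)

    # The adversary minimizes its own loss; the encoder minimizes the task
    # loss but ASCENDS the adversary loss (gradient reversal), pushing
    # sensitive-attribute information out of the shared embedding.
    grad_W = np.outer(u, g_t @ x) - lam * np.outer(v, g_a @ x)
    u -= lr * (z.T @ g_t)
    v -= lr * (z.T @ g_a)
    W -= lr * grad_W

z = x @ W.T
task_acc = float(np.mean((sigmoid(z @ u) > 0.5) == (y > 0.5)))
adv_acc = float(np.mean((sigmoid(z @ v) > 0.5) == (a > 0.5)))
print(f"task accuracy: {task_acc:.2f}, adversary accuracy: {adv_acc:.2f}")
```

With `lam = 0` the adversary is free to recover the sensitive attribute from the embedding; with the reversal term active, its accuracy should fall toward chance while task accuracy remains high, which is the qualitative behavior the fairness metrics (e.g., Equalized Odds difference) quantify.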