AI Summary
Self-supervised learning often amplifies bias related to sensitive attributes in learned representations, compromising fairness. To address this issue, this work proposes ProtoFair, a plug-and-play fair contrastive loss that improves representation fairness without modifying existing self-supervised frameworks and requires only sensitive attribute labels. Its key idea is to use unsupervised prototype clustering to construct pseudo-counterfactual sample pairs, which the work presents as the first use of such a mechanism in self-supervised learning to promote invariance to sensitive attributes. ProtoFair integrates with mainstream methods such as SimCLR and SupCon, consistently improving fairness metrics on benchmarks such as CelebA and UTKFace while maintaining competitive downstream task accuracy.
Abstract
Self-supervised learning methods learn high-quality visual representations, yet recent studies show that these representations often capture demographic biases present in the training data. Existing fairness-aware methods address this by redesigning the self-supervised objective itself, limiting portability across the rapidly evolving landscape of self-supervised learning (SSL) frameworks. We propose ProtoFair, a fairness-aware contrastive loss designed to work alongside existing SSL objectives without modifying them. ProtoFair leverages unsupervised prototype clustering to identify pseudo-counterfactual pairs: samples sharing the same cluster assignment but belonging to different sensitive groups. By pulling these content-matched, cross-group samples together in the embedding space, ProtoFair encourages the encoder to learn representations that are invariant to the sensitive attribute. The method requires only sensitive attribute annotations, no target labels, and integrates seamlessly with both SimCLR and SupCon. Experiments on CelebA and UTKFace demonstrate consistent fairness improvements while maintaining competitive accuracy.
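The pair construction described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the SupCon-style form of the loss, the temperature value, and the assumption that cluster assignments are already available (for example, from a separate prototype clustering step) are all hypothetical choices made for illustration.

```python
import math


def pseudo_counterfactual_pairs(cluster_ids, group_ids):
    """Return ordered index pairs (i, j) that share a cluster assignment
    but belong to different sensitive groups."""
    n = len(cluster_ids)
    return [
        (i, j)
        for i in range(n)
        for j in range(n)
        if i != j
        and cluster_ids[i] == cluster_ids[j]
        and group_ids[i] != group_ids[j]
    ]


def _normalize(v):
    # L2-normalize a vector; fall back to 1.0 to avoid division by zero.
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def fair_contrastive_loss(embeddings, cluster_ids, group_ids, temperature=0.1):
    """Hypothetical SupCon-style loss whose positives are the
    pseudo-counterfactual pairs. Anchors without any positive are skipped;
    returns 0.0 when no anchor has a positive."""
    z = [_normalize(v) for v in embeddings]
    n = len(z)

    # Group positives by anchor index.
    positives = {}
    for i, j in pseudo_counterfactual_pairs(cluster_ids, group_ids):
        positives.setdefault(i, []).append(j)

    total, count = 0.0, 0
    for i, pos in positives.items():
        # Temperature-scaled cosine similarities to every other sample.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp over the denominator.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        # Average negative log-probability of the positives for this anchor.
        total += -sum(logits[j] - log_denom for j in pos) / len(pos)
        count += 1
    return total / count if count else 0.0
```

Because the abstract describes ProtoFair as working alongside an existing SSL objective, one plausible use is to add this term to the base loss with a weighting coefficient, for example `base_loss + lam * fair_contrastive_loss(...)`, where `lam` is an assumed hyperparameter rather than one stated in the source.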