🤖 AI Summary
Deepfake detection models exhibit demographic bias, particularly across gender and race, which leads to systematic misclassifications and exacerbates digital inequity; existing fairness-enhancement methods often compromise detection accuracy. To address this, we propose a dual-mechanism co-optimization framework that integrates *sensitive-channel disentanglement* at the model architecture level (to decouple bias) with *distribution alignment* at the feature level (to pull each demographic group's distribution toward the overall sample distribution and foster globally fair representations), jointly improving both cross-group and intra-group fairness. Extensive experiments on multi-domain benchmarks demonstrate that our method maintains state-of-the-art detection performance (AUC > 98.5%) while significantly enhancing fairness: average equalized odds and demographic parity disparities (ΔEO/ΔDP) decrease by 37.2%, and intra-group variance drops by 29.6%. To the best of our knowledge, this is the first approach to achieve simultaneous high accuracy and strong fairness in deepfake detection.
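For readers unfamiliar with the two disparity measures named above, the following is a minimal sketch of how ΔDP and ΔEO are commonly computed from binary predictions. It is not taken from the paper: the function names are hypothetical, and the convention used here (largest between-group gap, with ΔEO taken as the larger of the TPR gap and the FPR gap) is only one of several in use, so the paper's exact definitions may differ.

```python
def _rate(preds):
    """Fraction of positive (1) predictions in a non-empty list."""
    return sum(preds) / len(preds)


def demographic_parity_gap(y_pred, groups):
    """Largest difference in positive-prediction rate between any two groups."""
    by_group = {}
    for p, g in zip(y_pred, groups):
        by_group.setdefault(g, []).append(p)
    rates = [_rate(v) for v in by_group.values()]
    return max(rates) - min(rates)


def equalized_odds_gap(y_true, y_pred, groups):
    """Larger of the between-group TPR gap and FPR gap (assumed convention)."""
    gaps = []
    for label in (0, 1):  # label 0 yields FPR per group, label 1 yields TPR
        by_group = {}
        for t, p, g in zip(y_true, y_pred, groups):
            if t == label:
                by_group.setdefault(g, []).append(p)
        rates = [_rate(v) for v in by_group.values()]
        if rates:
            gaps.append(max(rates) - min(rates))
    return max(gaps) if gaps else 0.0
```

A detector can satisfy demographic parity while still violating equalized odds, for example when two groups are flagged at the same overall rate but with different true-positive rates, which is why the two measures are usually reported together.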
📝 Abstract
Fairness is a core element in the trustworthy deployment of deepfake detection models, especially in the field of digital identity security. Biases in detection models toward different demographic groups, such as gender and race, may lead to systemic misjudgments, exacerbating the digital divide and social inequities. However, current fairness-enhanced detectors often improve fairness at the cost of detection accuracy. To address this challenge, we propose a dual-mechanism collaborative optimization framework that integrates structural fairness decoupling and global distribution alignment. At the model architecture level, it decouples the channels that are sensitive to demographic groups; at the feature level, it then reduces the distance between the overall sample distribution and the distribution of each demographic group. Experimental results demonstrate that, compared with other methods, our framework improves both inter-group and intra-group fairness while maintaining overall detection accuracy across domains.
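The feature-level alignment step can be illustrated with a small sketch. The abstract does not say which distance is used between the overall distribution and each group's distribution, so the code below substitutes the simplest possible stand-in: the squared Euclidean distance between each group's mean feature vector and the global mean, averaged over groups. The function names and this choice of distance are assumptions made for illustration, not the authors' formulation.

```python
def mean_vector(rows):
    """Column-wise mean of a non-empty list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def alignment_penalty(features, groups):
    """Average squared distance between each group's mean and the global mean.

    Assumption: first-moment matching stands in for the unspecified
    distribution distance described in the abstract.
    """
    global_mean = mean_vector(features)
    by_group = {}
    for row, g in zip(features, groups):
        by_group.setdefault(g, []).append(row)
    total = 0.0
    for rows in by_group.values():
        group_mean = mean_vector(rows)
        total += sum((a - b) ** 2 for a, b in zip(group_mean, global_mean))
    return total / len(by_group)
```

Under this stand-in, the penalty is zero when every group's mean feature vector coincides with the global mean and grows as groups drift apart in feature space. In training, a term of this kind would typically be added to the detection loss with a weighting coefficient.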