🤖 AI Summary
Deepfake detection models frequently exhibit significant demographic bias, particularly across race and gender subgroups, undermining fairness. To address this, we propose Fair-FLIP, a plug-and-play post-processing fairness method. It analyses the variance of a trained model's final-layer inputs across demographic subgroups and reweights them, prioritising low-variability inputs while demoting highly variable ones, thereby suppressing bias amplification. Fair-FLIP requires no architectural modification or model retraining and is compatible with mainstream deepfake detectors. Experiments across multiple benchmark datasets show that Fair-FLIP incurs only a marginal 0.25% drop in overall detection accuracy while improving key fairness metrics, including equal opportunity difference and mean absolute error difference, by up to 30%. To our knowledge, this is the first approach to systematically mitigate group-level unfairness in deepfake detection without compromising strong detection performance.
📝 Abstract
Artificial Intelligence-generated content has become increasingly popular, yet its malicious use, particularly deepfakes, poses a serious threat to public trust and discourse. While deepfake detection methods achieve high predictive performance, they often exhibit biases across demographic attributes such as ethnicity and gender. In this work, we tackle the challenge of fair deepfake detection, aiming to mitigate these biases while maintaining robust detection capabilities. To this end, we propose a novel post-processing approach, referred to as Fairness-Oriented Final Layer Input Prioritising (Fair-FLIP), that reweights a trained model's final-layer inputs to reduce subgroup disparities, prioritising those with low variability while demoting highly variable ones. Experimental results comparing Fair-FLIP to both the baseline (without fairness-oriented de-biasing) and state-of-the-art approaches show that Fair-FLIP can enhance fairness metrics by up to 30% while maintaining baseline accuracy, with only a negligible reduction of 0.25%.
Code is available on GitHub: https://github.com/szandala/fair-deepfake-detection-toolbox
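The abstract does not spell out the exact prioritisation rule, so the following is only an illustrative sketch of the core idea: score each final-layer input feature by how much its mean activation varies across demographic subgroups, then weight features inversely to that variance before the (frozen) final classifier. The function names, the inverse-variance form, and the normalisation are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def fair_flip_weights(features, subgroups, eps=1e-8):
    """Illustrative weighting: demote final-layer inputs whose mean
    activation varies strongly across demographic subgroups.

    features:  (n_samples, n_features) final-layer input activations
    subgroups: (n_samples,) demographic subgroup label per sample
    """
    # Mean activation of each feature within each subgroup
    group_means = np.stack(
        [features[subgroups == g].mean(axis=0) for g in np.unique(subgroups)]
    )
    # Variance of the subgroup means per feature: a high value means the
    # feature's activation level encodes subgroup membership
    across_group_var = group_means.var(axis=0)
    # Inverse-variance weighting: prioritise low-variability features,
    # demote highly variable ones (the eps guard avoids division by zero)
    weights = 1.0 / (across_group_var + eps)
    # Normalise so the weights have mean 1 and do not rescale the logits
    return weights / weights.sum() * features.shape[1]

def reweighted_logits(features, weights, W, b):
    """Apply the feature weights to the final-layer inputs, then run the
    trained (unchanged) linear classifier W, b."""
    return (features * weights) @ W + b
```

In this reading, only the element-wise weights are new; the trained classifier `W, b` and the backbone producing `features` stay untouched, which matches the paper's claim of requiring no retraining or architectural change.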