Fair Deepfake Detectors Can Generalize

📅 2025-07-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
Deepfake detection faces a fundamental challenge in simultaneously achieving generalization (to unseen manipulation types) and fairness (across demographic groups), with existing methods often treating them as mutually exclusive objectives. This paper formally uncovers the causal relationship between generalization and fairness, proposing a theoretical framework grounded in backdoor adjustment. We introduce DAID—a plug-and-play framework that jointly optimizes both goals via inverse propensity weighting, subgroup-wise feature normalization, sensitive-attribute-invariant feature aggregation, and a novel alignment loss—explicitly disentangling distributional bias from demographic signals. Evaluated on three cross-domain benchmarks, DAID consistently outperforms state-of-the-art methods, improving both generalization accuracy and fairness metrics. Our results validate the effectiveness of integrating causal modeling principles with principled architectural design for robust and equitable deepfake detection.

📝 Abstract
Deepfake detection models face two critical challenges: generalization to unseen manipulations and demographic fairness among population groups. However, existing approaches often demonstrate that these two objectives are inherently conflicting, revealing a trade-off between them. In this paper, we, for the first time, uncover and formally define a causal relationship between fairness and generalization. Building on the back-door adjustment, we show that controlling for confounders (data distribution and model capacity) enables improved generalization via fairness interventions. Motivated by this insight, we propose Demographic Attribute-insensitive Intervention Detection (DAID), a plug-and-play framework composed of: i) Demographic-aware data rebalancing, which employs inverse-propensity weighting and subgroup-wise feature normalization to neutralize distributional biases; and ii) Demographic-agnostic feature aggregation, which uses a novel alignment loss to suppress sensitive-attribute signals. Across three cross-domain benchmarks, DAID consistently achieves superior performance in both fairness and generalization compared to several state-of-the-art detectors, validating both its theoretical foundation and practical effectiveness.
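The back-door adjustment invoked in the abstract takes its standard form (a sketch of the general formula, not the paper's exact derivation; here $F$ denotes the fairness intervention and $Z$ stands for the confounders the paper identifies, data distribution and model capacity):

```latex
P\bigl(Y \mid \mathrm{do}(F)\bigr) \;=\; \sum_{z} P\bigl(Y \mid F,\, Z = z\bigr)\, P(Z = z)
```

Conditioning on $Z$ and marginalizing it out blocks the back-door paths through the confounders, which is why controlling for data distribution and model capacity lets a fairness intervention translate into a generalization gain.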
Problem

Research questions and friction points this paper is trying to address.

Improving generalization of deepfake detectors to unseen manipulations
Ensuring demographic fairness in deepfake detection models
Resolving trade-off between fairness and generalization via causal analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Demographic-aware data rebalancing with inverse-propensity weighting
Subgroup-wise feature normalization to neutralize biases
Demographic-agnostic feature aggregation using alignment loss
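The three components above can be sketched in PyTorch. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the per-subgroup standardization, and the mean-alignment form of the loss are all hypothetical stand-ins for DAID's actual formulations.

```python
import torch


def inverse_propensity_weights(group_ids: torch.Tensor) -> torch.Tensor:
    """Weight each sample by the inverse frequency of its demographic subgroup,
    so over-represented groups contribute less to the training loss.
    Weights are rescaled to have mean 1. (Hypothetical sketch.)"""
    _, inverse, counts = torch.unique(
        group_ids, return_inverse=True, return_counts=True
    )
    weights = 1.0 / counts.float()[inverse]
    return weights * len(weights) / weights.sum()


def subgroup_normalize(features: torch.Tensor, group_ids: torch.Tensor) -> torch.Tensor:
    """Standardize features within each demographic subgroup, neutralizing
    group-specific distribution shifts before aggregation."""
    out = features.clone()
    for g in torch.unique(group_ids):
        mask = group_ids == g
        mu = features[mask].mean(dim=0)
        sigma = features[mask].std(dim=0) + 1e-6  # avoid division by zero
        out[mask] = (features[mask] - mu) / sigma
    return out


def alignment_loss(features: torch.Tensor, group_ids: torch.Tensor) -> torch.Tensor:
    """Penalize the distance between each subgroup's mean feature and the
    global mean, suppressing sensitive-attribute signals. One plausible form
    of an alignment loss; the paper's exact objective may differ."""
    global_mean = features.mean(dim=0)
    groups = torch.unique(group_ids)
    loss = features.new_zeros(())
    for g in groups:
        group_mean = features[group_ids == g].mean(dim=0)
        loss = loss + (group_mean - global_mean).pow(2).sum()
    return loss / len(groups)
```

In a plug-and-play setting, the weights would multiply the per-sample detection loss, the normalization would sit between the backbone and the classifier head, and the alignment term would be added to the total objective with a tunable coefficient.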
Harry Cheng
National University of Singapore
Diffusion Model · MLLM Security · Deepfake Detection
Ming-Hui Liu
Shandong University
Deepfake Detection
Yangyang Guo
National University of Singapore
Tianyi Wang
National University of Singapore
Liqiang Nie
Harbin Institute of Technology (Shenzhen)
Mohan Kankanhalli
National University of Singapore