Fair Deepfake Detectors Can Generalize

📅 2025-07-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
Deepfake detection faces a fundamental challenge in simultaneously achieving generalization (to unseen manipulation types) and fairness (across demographic groups), with existing methods often treating them as mutually exclusive objectives. This paper formally uncovers the causal relationship between generalization and fairness, proposing a theoretical framework grounded in backdoor adjustment. We introduce DAID—a plug-and-play framework that jointly optimizes both goals via inverse propensity weighting, subgroup-wise feature normalization, sensitive-attribute-invariant feature aggregation, and a novel alignment loss—explicitly disentangling distributional bias from demographic signals. Evaluated on three cross-domain benchmarks, DAID consistently outperforms state-of-the-art methods, improving both generalization accuracy and fairness metrics. Our results validate the effectiveness of integrating causal modeling principles with principled architectural design for robust and equitable deepfake detection.

📝 Abstract
Deepfake detection models face two critical challenges: generalization to unseen manipulations and demographic fairness among population groups. However, existing approaches often demonstrate that these two objectives are inherently conflicting, revealing a trade-off between them. In this paper, we, for the first time, uncover and formally define a causal relationship between fairness and generalization. Building on the back-door adjustment, we show that controlling for confounders (data distribution and model capacity) enables improved generalization via fairness interventions. Motivated by this insight, we propose Demographic Attribute-insensitive Intervention Detection (DAID), a plug-and-play framework composed of: i) Demographic-aware data rebalancing, which employs inverse-propensity weighting and subgroup-wise feature normalization to neutralize distributional biases; and ii) Demographic-agnostic feature aggregation, which uses a novel alignment loss to suppress sensitive-attribute signals. Across three cross-domain benchmarks, DAID consistently achieves superior performance in both fairness and generalization compared to several state-of-the-art detectors, validating both its theoretical foundation and practical effectiveness.
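The back-door adjustment invoked in the abstract takes its standard form (a sketch of the general formula, not the paper's exact derivation; here $F$ denotes the fairness intervention and $Z$ stands for the confounders the paper identifies, data distribution and model capacity):

```latex
P\bigl(Y \mid \mathrm{do}(F)\bigr) \;=\; \sum_{z} P\bigl(Y \mid F,\, Z = z\bigr)\, P(Z = z)
```

Conditioning on $Z$ and marginalizing it out blocks the back-door paths through the confounders, which is why controlling for data distribution and model capacity lets a fairness intervention translate into a generalization gain.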
Problem

Research questions and friction points this paper is trying to address.

Improving generalization of deepfake detectors to unseen manipulations
Ensuring demographic fairness in deepfake detection models
Resolving trade-off between fairness and generalization via causal analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Demographic-aware data rebalancing with inverse-propensity weighting
Subgroup-wise feature normalization to neutralize biases
Demographic-agnostic feature aggregation using alignment loss
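The three components above can be sketched in PyTorch. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the per-subgroup standardization, and the mean-alignment form of the loss are all hypothetical stand-ins for DAID's actual formulations.

```python
import torch


def inverse_propensity_weights(group_ids: torch.Tensor) -> torch.Tensor:
    """Weight each sample by the inverse frequency of its demographic subgroup,
    so over-represented groups contribute less to the training loss.
    Weights are rescaled to have mean 1. (Hypothetical sketch.)"""
    _, inverse, counts = torch.unique(
        group_ids, return_inverse=True, return_counts=True
    )
    weights = 1.0 / counts.float()[inverse]
    return weights * len(weights) / weights.sum()


def subgroup_normalize(features: torch.Tensor, group_ids: torch.Tensor) -> torch.Tensor:
    """Standardize features within each demographic subgroup, neutralizing
    group-specific distribution shifts before aggregation."""
    out = features.clone()
    for g in torch.unique(group_ids):
        mask = group_ids == g
        mu = features[mask].mean(dim=0)
        sigma = features[mask].std(dim=0) + 1e-6  # avoid division by zero
        out[mask] = (features[mask] - mu) / sigma
    return out


def alignment_loss(features: torch.Tensor, group_ids: torch.Tensor) -> torch.Tensor:
    """Penalize the distance between each subgroup's mean feature and the
    global mean, suppressing sensitive-attribute signals. One plausible form
    of an alignment loss; the paper's exact objective may differ."""
    global_mean = features.mean(dim=0)
    groups = torch.unique(group_ids)
    loss = features.new_zeros(())
    for g in groups:
        group_mean = features[group_ids == g].mean(dim=0)
        loss = loss + (group_mean - global_mean).pow(2).sum()
    return loss / len(groups)
```

In a plug-and-play setting, the weights would multiply the per-sample detection loss, the normalization would sit between the backbone and the classifier head, and the alignment term would be added to the total objective with a tunable coefficient.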
Harry Cheng
National University of Singapore
Diffusion Model · MLLM Security · Deepfake Detection
Ming-Hui Liu
Shandong University
Deepfake Detection
Yangyang Guo
National University of Singapore
Tianyi Wang
National University of Singapore
Liqiang Nie
Harbin Institute of Technology (Shenzhen)
Mohan Kankanhalli
National University of Singapore