DeepShield: Fortifying Deepfake Video Detection with Local and Global Forgery Analysis

📅 2025-10-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing deepfake detectors rely heavily on domain-specific forensic traces, so they generalize poorly across domains and are not robust to unseen manipulation techniques. To address this, we propose a detection framework that jointly models local and global forgery characteristics. It introduces a local patch-guided mechanism for fine-grained anomaly localization and adds global forgery diversity modeling to improve adaptability across manipulation types and datasets. Built on the CLIP-ViT architecture, it combines spatiotemporal artifact modeling, patch-level supervision, domain-aware feature enhancement, and boundary-expanded feature generation to enable multi-scale forgery analysis. In cross-dataset and cross-manipulation benchmarks, the method outperforms state-of-the-art approaches. It also achieves accuracy gains in zero-shot forgery detection scenarios, where the training data excludes the target manipulation type, which improves generalization to unseen attacks and overall detection robustness.

📝 Abstract

Recent advances in deep generative models have made it easier to manipulate face videos, raising significant concerns about their potential misuse for fraud and misinformation. Existing detectors often perform well in in-domain scenarios but fail to generalize across diverse manipulation techniques due to their reliance on forgery-specific artifacts. In this work, we introduce DeepShield, a novel deepfake detection framework that balances local sensitivity and global generalization to improve robustness across unseen forgeries. DeepShield enhances the CLIP-ViT encoder through two key components: Local Patch Guidance (LPG) and Global Forgery Diversification (GFD). LPG applies spatiotemporal artifact modeling and patch-wise supervision to capture fine-grained inconsistencies often overlooked by global models. GFD introduces domain feature augmentation, leveraging domain-bridging and boundary-expanding feature generation to synthesize diverse forgeries, mitigating overfitting and enhancing cross-domain adaptability. Through the integration of novel local and global analysis for deepfake detection, DeepShield outperforms state-of-the-art methods in cross-dataset and cross-manipulation evaluations, achieving superior robustness against unseen deepfake attacks.
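The abstract does not spell out how domain-bridging and boundary-expanding feature generation are computed. As a rough illustration only, the sketch below assumes that bridging means interpolating between forgery features from two manipulation domains, and that boundary expansion means pushing a forgery feature further away from the centroid of real features. The function names (`bridge`, `expand`, `centroid`, `diversify`) and the parameters `lam`, `alpha`, and `alpha_max` are hypothetical and do not come from the paper.

```python
import random


def bridge(feat_a, feat_b, lam):
    """Assumed domain-bridging: linear interpolation between two forgery features."""
    return [(1.0 - lam) * a + lam * b for a, b in zip(feat_a, feat_b)]


def expand(fake_feat, real_center, alpha):
    """Assumed boundary expansion: move a forgery feature away from the real centroid."""
    return [f + alpha * (f - r) for f, r in zip(fake_feat, real_center)]


def centroid(feats):
    """Element-wise mean of a list of equal-length feature vectors."""
    n = len(feats)
    return [sum(col) / n for col in zip(*feats)]


def diversify(fakes_a, fakes_b, reals, rng, alpha_max=0.5):
    """Return two synthetic forgery features per (fakes_a, fakes_b) pair."""
    center = centroid(reals)
    synthetic = []
    for fa, fb in zip(fakes_a, fakes_b):
        synthetic.append(bridge(fa, fb, rng.random()))
        synthetic.append(expand(fa, center, rng.uniform(0.0, alpha_max)))
    return synthetic
```

In a setup like this, the synthetic features would be labeled as forgeries and mixed into the training batch, which is one plausible way to expose the classifier to forgery variations absent from the training set.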

Problem

Research questions and friction points this paper is trying to address.

Detecting manipulated face videos across diverse forgery techniques

Improving generalization beyond specific training artifacts

Balancing local detail analysis with global feature robustness

Innovation

Methods, ideas, or system contributions that make the work stand out.

Local Patch Guidance models spatiotemporal artifacts

Global Forgery Diversification augments domain features

Integrates local and global analysis for robustness
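One way to picture how local and global supervision might be combined is a training objective that adds a patch-level term to the usual video-level real/fake term. The sketch below is an assumption-laden illustration, not the loss used in DeepShield: it assumes per-patch logits with binary patch labels (1 for a manipulated patch), a single video-level logit, and a hypothetical weighting parameter `patch_weight`.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one probability p and one label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def local_global_loss(patch_logits, patch_labels, video_logit, video_label,
                      patch_weight=1.0):
    """Video-level BCE plus a weighted mean of patch-level BCE terms (assumed form)."""
    patch_loss = sum(
        bce(sigmoid(z), y) for z, y in zip(patch_logits, patch_labels)
    ) / len(patch_logits)
    video_loss = bce(sigmoid(video_logit), video_label)
    return video_loss + patch_weight * patch_loss
```

The patch term penalizes the model when it misses a small manipulated region even if the video-level prediction is correct, which matches the stated goal of capturing fine-grained inconsistencies that global models overlook.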
