🤖 AI Summary
Existing deepfake detection methods exhibit poor generalization across object categories (e.g., faces → cars) and generative domains (e.g., GANs → Stable Diffusion), primarily due to CNNs' tendency to overfit local semantic distributions and the discriminative information loss induced by global average pooling (GAP). To address this, we propose a Local Focus Mechanism (LFM) that employs a salience network to guide feature selection and adopts Top-K pooling to dynamically retain the most discriminative local patterns. We further introduce two regularization strategies, Rank-Based Linear Dropout and Random-K Sampling, to mitigate overfitting. Crucially, LFM eliminates GAP entirely. Evaluated on multiple cross-domain benchmarks, our method achieves a 3.7% absolute accuracy gain and a 2.8% improvement in average precision over the state-of-the-art Neighboring Pixel Relationships (NPR) approach. With a throughput of 1789 FPS on a single A6000 GPU, it establishes a new benchmark for cross-domain deepfake detection.
📝 Abstract
The rapid advancement of deepfake generation techniques has intensified the need for robust and generalizable detection methods. Existing approaches based on reconstruction learning typically leverage deep convolutional networks to extract differential features. However, these methods show poor generalization across object categories (e.g., from faces to cars) and generation domains (e.g., from GANs to Stable Diffusion), due to intrinsic limitations of deep CNNs. First, models trained on a specific category tend to overfit to semantic feature distributions, making them less transferable to other categories, especially as network depth increases. Second, Global Average Pooling (GAP) compresses critical local forgery cues into a single vector, thus discarding discriminative patterns vital for real-fake classification. To address these issues, we propose a novel Local Focus Mechanism (LFM) that explicitly attends to discriminative local features for differentiating fake from real images. LFM integrates a Salience Network (SNet) with a task-specific Top-K Pooling (TKP) module to select the K most informative local patterns. To mitigate potential overfitting introduced by Top-K pooling, we introduce two regularization techniques: Rank-Based Linear Dropout (RBLD) and Random-K Sampling (RKS), which enhance the model's robustness. LFM achieves a 3.7% improvement in accuracy and a 2.8% increase in average precision over the state-of-the-art Neighboring Pixel Relationships (NPR) method, while maintaining exceptional efficiency at 1789 FPS on a single NVIDIA A6000 GPU. Our approach sets a new benchmark for cross-domain deepfake detection. The source code is available at https://github.com/lmlpy/LFM.git
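The contrast between GAP and Top-K pooling can be illustrated with a minimal sketch. The function names and shapes below are our own illustrative choices, not the paper's implementation: GAP averages every spatial position of a channel into one number, while Top-K pooling keeps only the K strongest responses per channel, so a few highly discriminative local cues are not diluted by the many uninformative positions.

```python
import numpy as np

def gap(features):
    """Global Average Pooling: mean over all H*W positions per channel.
    features: array of shape (C, H, W) -> vector of shape (C,)."""
    return features.reshape(features.shape[0], -1).mean(axis=1)

def top_k_pooling(features, k):
    """Illustrative Top-K pooling: average only the k largest responses
    per channel, preserving strong local forgery cues.
    features: array of shape (C, H, W) -> vector of shape (C,)."""
    flat = features.reshape(features.shape[0], -1)   # (C, H*W)
    top_k = np.sort(flat, axis=1)[:, -k:]            # k largest per channel
    return top_k.mean(axis=1)

# Toy feature map: two strong local activations on an otherwise flat channel.
x = np.zeros((1, 4, 4))
x[0, 0, 0], x[0, 1, 1] = 8.0, 4.0
print(gap(x)[0])             # 0.75  -> strong cues diluted by 14 zeros
print(top_k_pooling(x, 2)[0])  # 6.0 -> strong cues preserved
```

In the paper's full method, the selection is additionally guided by the Salience Network rather than by raw activation magnitude alone, and RBLD/RKS regularize which of the top responses are kept during training.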