MoE-FFD: Mixture of Experts for Generalized and Parameter-Efficient Face Forgery Detection

📅 2024-04-12
🏛️ arXiv.org
📈 Citations: 11
Influential: 1
🤖 AI Summary
To address three key challenges in Vision Transformer (ViT)-based face forgery detection—prohibitive computational cost of full fine-tuning, weak modeling of local forensic cues, and narrow coverage of forgery artifacts—this paper proposes a parameter-efficient and generalizable Mixture-of-Experts (MoE) architecture. Methodologically, we freeze the pre-trained ViT backbone and fine-tune only lightweight LoRA and Adapter modules; integrate global Transformer representations with local CNN priors; and introduce a dynamic routing mechanism to enable multi-granularity forgery pattern modeling and expert-wise lightweight adaptation. Our contribution is the first application of MoE to face forgery detection, enabling plug-and-play transfer across diverse ViT variants. Experiments demonstrate state-of-the-art performance on multiple benchmarks, with over 90% reduction in trainable parameters, significantly improved cross-dataset generalization, and enhanced computational efficiency.

📝 Abstract
Deepfakes have recently raised significant trust issues and security concerns among the public. Compared to CNN face forgery detectors, ViT-based methods take advantage of the expressivity of transformers, achieving superior detection performance. However, these approaches still exhibit the following limitations: (1) Fully fine-tuning ViT-based models from ImageNet weights demands substantial computational and storage resources; (2) ViT-based methods struggle to capture local forgery clues, leading to model bias; (3) These methods limit their scope on only one or few face forgery features, resulting in limited generalizability. To tackle these challenges, this work introduces Mixture-of-Experts modules for Face Forgery Detection (MoE-FFD), a generalized yet parameter-efficient ViT-based approach. MoE-FFD only updates lightweight Low-Rank Adaptation (LoRA) and Adapter layers while keeping the ViT backbone frozen, thereby achieving parameter-efficient training. Moreover, MoE-FFD leverages the expressivity of transformers and local priors of CNNs to simultaneously extract global and local forgery clues. Additionally, novel MoE modules are designed to scale the model's capacity and smartly select optimal forgery experts, further enhancing forgery detection performance. Our proposed learning scheme can be seamlessly adapted to various transformer backbones in a plug-and-play manner. Extensive experimental results demonstrate that the proposed method achieves state-of-the-art face forgery detection performance with significantly reduced parameter overhead. The code is released at: https://github.com/LoveSiameseCat/MoE-FFD.
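The parameter-efficient training described above — a frozen pre-trained weight augmented by a trainable low-rank update — can be sketched as follows. This is an illustrative LoRA wrapper, not the paper's exact implementation; the rank, scaling factor, and layer names are assumptions.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen linear layer with a trainable low-rank update.

    The pre-trained weight W stays frozen; only the low-rank factors
    A (r x d_in) and B (d_out x r) are learned, so the trainable
    parameter count is r * (d_in + d_out) instead of d_in * d_out.
    Rank r and scaling alpha are illustrative defaults, not values
    from the paper.
    """

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze the pre-trained backbone weight
            p.requires_grad = False
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero-init: no change at start
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus scaled low-rank correction B @ A @ x
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T
```

Because `B` is zero-initialized, the wrapped layer initially reproduces the frozen backbone exactly, and fine-tuning only moves the low-rank residual — this is what keeps the trainable-parameter overhead small.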
Problem

Research questions and friction points this paper is trying to address.

Efficiently fine-tuning ViT models for face forgery detection
Capturing both global and local forgery clues effectively
Enhancing generalization across diverse face forgery features
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture-of-Experts modules for enhanced detection
Frozen ViT backbone with LoRA/Adapter parameter efficiency
Combines transformer expressivity with CNN local priors
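The dynamic routing idea behind the MoE modules — a learned gate that scores a pool of lightweight experts per input and mixes only the top-k — can be sketched as below. The expert count, k, and the small-MLP expert form are assumptions for illustration, not the paper's exact design.

```python
import torch
import torch.nn as nn

class MoEGate(nn.Module):
    """Minimal top-k mixture-of-experts routing sketch.

    A linear gate scores every expert for each input; only the top-k
    experts run, and their outputs are mixed with softmax-renormalized
    gate weights. All hyperparameters here are illustrative.
    """

    def __init__(self, dim: int, n_experts: int = 4, k: int = 2):
        super().__init__()
        self.gate = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim // 2), nn.GELU(), nn.Linear(dim // 2, dim))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, dim), e.g. pooled ViT token features
        scores = self.gate(x)                       # (batch, n_experts)
        topv, topi = scores.topk(self.k, dim=-1)    # keep the k best experts
        weights = torch.softmax(topv, dim=-1)       # renormalize over the top-k
        out = torch.zeros_like(x)
        for b in range(x.size(0)):                  # clarity over speed in this sketch
            for j in range(self.k):
                out[b] += weights[b, j] * self.experts[topi[b, j]](x[b])
        return out
```

The per-sample loop trades efficiency for readability; real MoE layers batch tokens per expert. The top-k selection is what lets capacity scale with the expert pool while keeping per-input compute roughly constant.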
Chenqi Kong
Rapid-Rich Object Search (ROSE) Lab, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, 639798
Anwei Luo
Jiangxi University of Finance and Economics
deepfake · face forgery detection · multimedia security · forensics
Song Xia
NTU
Machine Learning
Yi Yu
Rapid-Rich Object Search (ROSE) Lab, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, 639798
Haoliang Li
Department of Electrical Engineering, City University of Hong Kong
AI Security · Information Forensics and Security · Machine Learning
A. Kot
Rapid-Rich Object Search (ROSE) Lab, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, 639798