🤖 AI Summary
The increasing realism of deepfake videos has significantly heightened the difficulty of reliable detection. Method: To address this, we introduce AV-Deepfake1M++—a large-scale, audio-visual deepfake benchmark explicitly designed for realistic web environments—comprising 2 million video clips. It integrates state-of-the-art text-to-speech and facial reenactment models and systematically incorporates 12 types of real-world corruptions—including compression, noise, and resolution degradation—to enable controllable modeling of complex interference. Contribution/Results: Compared to existing benchmarks, AV-Deepfake1M++ substantially enhances the diversity, photorealism, and ecological validity of forged samples, providing a more challenging and realistic foundation for training and evaluating detection algorithms. As the official benchmark for the 2025 1M-Deepfakes Detection Challenge, it advances the development of practical, robust deepfake detection systems.
📝 Abstract
The rapid surge of text-to-speech and face-voice reenactment models has made video fabrication easier and highly realistic. To counter this problem, we require datasets that are rich in generation methods and perturbation strategies, as such perturbations are common in online videos. To this end, we propose AV-Deepfake1M++, an extension of AV-Deepfake1M comprising 2 million video clips with diversified manipulation strategies and audio-visual perturbations. This paper describes the data generation strategies and benchmarks AV-Deepfake1M++ using state-of-the-art methods. We believe that this dataset will play a pivotal role in facilitating research in the deepfake domain. Based on this dataset, we host the 2025 1M-Deepfakes Detection Challenge. The challenge details, dataset, and evaluation scripts are available online under a research-only license at https://deepfakes1m.github.io/2025.